# Running Python or C Code from Julia

This notebook gives a basic introduction to running Python code (using the `PyCall.jl` package) and C code (which requires no package) from Julia. See the [PyCall.jl](https://github.com/JuliaPy/PyCall.jl) homepage for instructions on how to either use an existing Python installation or let the package install one.
 
You need Python's `statsmodels` package installed to run the code below. If you have let PyCall install Python for you, use [Conda.jl](https://github.com/JuliaPy/Conda.jl) to add packages: `import Conda; Conda.add("statsmodels")`.

An alternative package (not used here) for running Python is [PythonCall.jl](https://github.com/cjdoris/PythonCall.jl), which seems to be gaining popularity.

Another notebook discusses how to run R code.

In [1]:
using Printf, DelimitedFiles
include("src/printmat.jl");

# Load Data

In [2]:
x = readdlm("Data/MyData.csv",',',skipstart=1)  #reading the csv file

(Rme,Rf,R) = (x[:,2],x[:,3],x[:,4])  #creating variables from columns of x
y  = R - Rf                          #do R .- Rf if R has several columns

c = ones(length(Rme))
x = [c Rme]

b = x\y
println("OLS coeffs according to Julia")
printmat(b)

OLS coeffs according to Julia
    -0.504
     1.341
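
The backslash operator in `b = x\y` solves the least-squares problem directly. As a minimal sketch of what that means, the cell below compares it with the textbook normal equations, `(x'x)b = x'y`, on synthetic data. The names `T`, `Rme_`, `y_`, `x_`, `b_qr` and `b_ne` are made up for this illustration and are not part of the notebook's data set.

```julia
using LinearAlgebra, Random

Random.seed!(123)                        # synthetic data, not the notebook's data set
T    = 200
Rme_ = randn(T)                          # hypothetical regressor
y_   = 0.5 .+ 1.3 .* Rme_ .+ 0.1 .* randn(T)
x_   = [ones(T) Rme_]                    # T×2 regressor matrix, first column is a constant

b_qr = x_ \ y_                           # least squares (QR-based for a tall matrix)
b_ne = (x_'x_) \ (x_'y_)                 # normal equations, (x'x)b = x'y

println(b_qr)
println(b_ne)
```

The two vectors should agree up to rounding error. The backslash form is usually preferred since it avoids forming `x'x` explicitly.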



# Python

In the next cells we *(a)* load the PyCall.jl package and activate the (Python) package `statsmodels`; *(b)* call some functions (e.g. `OLS()`) from statsmodels.

In [3]:
using PyCall
sm = pyimport("statsmodels.api");     #activate this package and call it `sm`

In [4]:
resultsP = sm.OLS(y, x).fit()        #can use Python functions directly

println(resultsP.summary())

PyObject <class 'statsmodels.iolib.summary.Summary'>
"""
                            OLS Regression Results                            
Dep. Variable:                      y   R-squared:                       0.519
Model:                            OLS   Adj. R-squared:                  0.518
Method:                 Least Squares   F-statistic:                     416.2
Date:                Mon, 25 Nov 2024   Prob (F-statistic):           2.72e-63
Time:                        09:42:56   Log-Likelihood:                -1241.7
No. Observations:                 388   AIC:                             2487.
Df Residuals:                     386   BIC:                             2495.
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
--------------------------------------------------------------------------

In [5]:
println(keys(resultsP))              #print all keys (field names)

[:HC0_se, :HC1_se, :HC2_se, :HC3_se, :_HCCM, :__class__, :__delattr__, :__dict__, :__dir__, :__doc__, :__eq__, :__format__, :__ge__, :__getattribute__, :__getstate__, :__gt__, :__hash__, :__init__, :__init_subclass__, :__le__, :__lt__, :__module__, :__ne__, :__new__, :__reduce__, :__reduce_ex__, :__repr__, :__setattr__, :__sizeof__, :__str__, :__subclasshook__, :__weakref__, :_abat_diagonal, :_cache, :_data_attr, :_data_in_cache, :_get_robustcov_results, :_get_wald_nonlinear, :_is_nested, :_transform_predict_exog, :_use_t, :_wexog_singular_values, :aic, :bic, :bse, :centered_tss, :compare_f_test, :compare_lm_test, :compare_lr_test, :condition_number, :conf_int, :conf_int_el, :cov_HC0, :cov_HC1, :cov_HC2, :cov_HC3, :cov_kwds, :cov_params, :cov_type, :df_model, :df_resid, :diagn, :eigenvals, :el_test, :ess, :f_pvalue, :f_test, :fittedvalues, :fvalue, :get_influence, :get_prediction, :get_robustcov_results, :info_criteria, :initialize, :k_constant, :llf, :load, :model, :mse_model, :mse_re

In [6]:
b_P = resultsP.params                #the numerical results are now a Julia vector

printblue("Comparing the estimates in Julia and Python:")
printmat([b b_P];colNames=["Julia","Python"])

Comparing the estimates in Julia and Python:
     Julia    Python
    -0.504    -0.504
     1.341     1.341



In [7]:
#we can run a block of Python code like this; $x and $y interpolate the Julia variables x and y
py"""
import numpy as np
xx = np.matmul(np.transpose($x),$x)
xy = np.matmul(np.transpose($x),$y)
b_p = np.linalg.solve(xx,xy)
"""

py"b_p"               #fetch the Python variable b_p (shown as the cell's output)

2-element Vector{Float64}:
 -0.5041626034967046
  1.3410486453848383

# C

This section shows some simple examples of how to call a C function. The functions are in the file `My_C_Stuff.c` (printed in the next cell). The first function, `c_dot`, computes the dot product of two vectors; the second, `c_ols`, estimates a simple linear regression.

In [8]:
println(read("Data/My_C_Stuff.c",String))

#include <stddef.h>

// calculate the inner (dot) product of vectors Y and X, returns the result (Sxy)
double c_dot(size_t n, double *Y, double *X) {
    double Sxy = 0.0;
    for (size_t i = 0; i < n; ++i) {
        Sxy += X[i]*Y[i];
    }
    return Sxy;
}

// calculate a simple regression, Y = a + b*X + u, puts (a,b) in vector ab, returns nothing
void c_ols(size_t n, double *Y, double *X, double *ab) {
    double Sx = 0.0, Sy = 0.0, Sxx = 0.0, Sxy = 0.0;
    for (size_t i = 0; i < n; ++i) {
        Sx  += X[i];
        Sy  += Y[i];
        Sxx += X[i]*X[i];
        Sxy += X[i]*Y[i];
    }
    ab[1] = (Sxy-Sx*Sy/n)/(Sxx-Sx*Sx/n);   //slope
    ab[0] = (Sy - ab[1]*Sx)/n;             //intercept
}


To compile to a dynamic library (a dll on Windows), I use gcc (for x86_64) from [mingw-w64](http://mingw-w64.org)
and run the following in the mingw terminal:
```
gcc -shared -fPIC My_C_Stuff.c -o My_C_Stuff.dll
```

To call the C functions, place the dll file in the current folder and then run the following cells.

In [9]:
mylibc = "My_C_Stuff.dll"
x2     = x[:,2];               #get a vector with the regressor values
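
The cell above hard-codes the Windows file name. As a hedged sketch for other systems, the standard library `Libdl` exposes the platform's shared-library extension as `Libdl.dlext`. This assumes the C file was compiled to a file with that extension in the current folder; the names `libfile` and `libpath` are hypothetical.

```julia
using Libdl

# Libdl.dlext is "dll" on Windows, "so" on Linux and "dylib" on macOS
libfile = "My_C_Stuff." * Libdl.dlext      # assumed file name of the compiled library
libpath = joinpath(pwd(), libfile)         # absolute path to the file in the current folder

println(libpath)
println(isfile(libpath) ? "library found" : "library not found: compile it first")
```

An absolute path avoids relying on how the operating system searches for libraries.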

## A Function which Returns a Number

In the next example, we call the function `c_dot` in `My_C_Stuff.dll`. The function calculates the inner product of two vectors.

The details are:
1. `mylibc.c_dot` is the library.function
2. `length(y)::Csize_t` is the first input and its type (an integer indicating the number of elements in `y`)
3. `y::Ptr{Float64}` is the second input (a pointer to an array of Floats) and similarly for `x2`
4. `Float64` is the type of the output

(We could potentially wrap this in a Julia function that checks for the right input types and outputs the result.)
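
Such a wrapper could look roughly like the sketch below. The names `MYLIBC` and `c_dot_checked` are hypothetical, and the library file is assumed to be `My_C_Stuff.dll` in the current folder. The wrapper checks the lengths and converts the inputs to `Vector{Float64}` before handing them to C.

```julia
const MYLIBC = "My_C_Stuff.dll"          # assumed library file name (hypothetical constant)

# Hypothetical wrapper around the C function `c_dot`: validate the inputs, then call C
function c_dot_checked(y::AbstractVector{<:Real}, x::AbstractVector{<:Real})
    length(y) == length(x) || throw(DimensionMismatch("y and x must have the same length"))
    yv = convert(Vector{Float64}, y)     # C expects contiguous arrays of doubles
    xv = convert(Vector{Float64}, x)
    return @ccall MYLIBC.c_dot(length(yv)::Csize_t, yv::Ptr{Float64}, xv::Ptr{Float64})::Float64
end

# usage (requires the compiled library): z = c_dot_checked(y, x2)
```

The length check matters because the C function trusts `n` and would otherwise read past the end of the shorter array.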

In [10]:
z = @ccall mylibc.c_dot(length(y)::Csize_t, y::Ptr{Float64}, x2::Ptr{Float64})::Float64

printlnPs("The inner product of x2 and y in Julia and C:  ",x2'y," ",z)

The inner product of x2 and y in Julia and C:   11071.648           11071.648


## A Function which Returns a Vector

The details are as above, except that:
1. `mylibc.c_ols` is the library.function
2. `Cvoid` is the type of the output, which here indicates that the function does not have an output. Rather, the function modifies the vector `b_c` by putting the OLS results there.

In [11]:
b_c = zeros(2)          #where C will store the regression results

@ccall mylibc.c_ols(length(y)::Csize_t, y::Ptr{Float64}, x2::Ptr{Float64}, b_c::Ptr{Float64})::Cvoid

println("Comparing the estimates in Julia and C")
printmat([b b_c];colNames=["Julia","C"])

Comparing the estimates in Julia and C
     Julia         C
    -0.504    -0.504
     1.341     1.341
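
To see what `c_ols` computes without compiling anything, the sketch below ports its loop to Julia and compares the result with the backslash solution. The function `ols_simple` and the small data set are made up for this illustration. Note that C indexes from 0 and Julia from 1, so C's `ab[0]` (intercept) and `ab[1]` (slope) correspond to the first and second elements of the Julia vector.

```julia
# Julia port of the loop in `c_ols` (hypothetical helper, for checking the formulas)
function ols_simple(Y::Vector{Float64}, X::Vector{Float64})
    n = length(Y)
    Sx = Sy = Sxx = Sxy = 0.0
    for i in 1:n
        Sx  += X[i]
        Sy  += Y[i]
        Sxx += X[i]*X[i]
        Sxy += X[i]*Y[i]
    end
    slope     = (Sxy - Sx*Sy/n)/(Sxx - Sx*Sx/n)
    intercept = (Sy - slope*Sx)/n
    return [intercept, slope]            # same order as C's ab[0], ab[1]
end

X_ = [1.0, 2.0, 3.0, 4.0]                # made-up data
Y_ = [2.1, 3.9, 6.2, 7.8]

ab   = ols_simple(Y_, X_)
b_ls = [ones(length(X_)) X_] \ Y_        # reference: least squares with a constant

println(ab)
println(b_ls)
```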

