LRG Model Simulation Code Documentation

Note that this is an initial draft of both the final version of the code and documentation. Any contents herein are subject to change.

Introduction

This package contains a combination of python, C++, CUDA, and julia code to run the trade model simulation of the Long-run-growth project. The code is organized in such a way that the most performance sensitive and technical aspects of the codebase are handled in sufficient rigor by the lower level code while still allowing easy access to the overall model from a high-level, Python, object-based interface. This is done to combine the benefits of multiple languages and coding paradigms such that the code can be as performant as possible while still being accessible--at least at a high level--by somebody with minimal experience in low level and/or mathematical programming.

This documentation will cover, on one end, the high level interface to access the model and data through python, while also detailing the necessary steps to compile and run the entire codebase, and, finally, the more technical implementation details to allow for modification of any layer of the modular codebase.

Installation

Requirements

git
Python 3.11 or greater
A python virtual environment manager
- i.e. pyenv w/ pyenv-virtualenv or conda
gcc compiler
GNU Make
For full GPU accelerated features:
- A CUDA compatible GPU
- A CUDA installation including:
  - cublas v2
  - cudart
- nvcc compiler

Note that this installation procedure should be identical across most Unix systems (Linux/MacOS) however in order to properly install and run the package on Windows several changes would need to be made, in particular to the compilation and loading of the C++ shared libraries.

Walkthrough

Clone the repo into the desired location
Install the required python packages (ideally within a virtual environment):

numpy
juliacall

From the project root directory run make all in the terminal to compile the C++/CUDA source code into shared libraries (.so files) in the lib folder
If not already present make a folder called data in project root directory, this is where input and output data will be placed for the model
To call the model from the associated python module either create a python script in the src directory and run it from the project root directory with python src/<script_name.py> or load model interface from a REPL or interactive shell started from the project root directory (in which case the core module should be imported with from src.sim.core import *)

Usage

Initialization

Within a script, the core model implementation can be imported with

from sim.core import *

Then, it is necessary to create a Model object with

<model_name> = Model(<model_size>)

Where model_size is either an integer variable or literal representing the number of active nodes within the model.

Data Storage and Access

Upon creation, the Model object will allocate the necessary memory to hold its data through the C++ API, these data arrays are exposed to the end Python user through the use of numpy arrays that are members of the respective Model object and can be accessed both for reference and assignment through

<model_name>.<array_name>

The currently accessible arrays are:

A -- the vector of cell knowledge endowments
L -- the vector of cell labor (population) endowments
B -- the vector of cell inherent fertility values
Y -- the vector of cell total incomes
P -- the vector of cell ...
Pi -- the vector of cell ...
t -- the matrix (dim 2 array) of direct inter-cell travel costs
tau -- the matrix (dim 2 array) of globally aggregated inter-cell travel costs
X -- the matrix (dim 2 array) of ...
Xi -- the matrix (dim 2 array) of ...

Note that members 1-6 are of length <model_size> and 7-10 are technically of length <model_size> * <model_size> but in practical terms should be accessed as nested arrays, i.e. to get the direct travel cost between cells 100 and 101 call <model_name>.t[100][101].

Data I/O

Once initialized, the Model member arrays can be accessed, copied, or modified at will the same way any standard numpy array would be. This allows for a high degree of flexibility to integrate this interface into any desired codebase or processing pipeline. However some simple data i/o is provided within the sim.io module.

from sim.io import *

Data can be loaded from a csv file with the function read_array(filepath: str, array: numpy.ndarray, array_len: int, indexed: bool = False).

The variable indexed is an optional boolean stating whether the csv file is indexed or not. By default the function will assume an unindexed csv file with a single floating point number on each line. If the csv is indexed (i.e. in the form 1,5.2 for a single line) then pass True to the indexed variable in the function call.

Example: read_array('data/A.csv', my_model.A, 1861, True)

Data can be written to a csv file with the function write_array(filepath: str, array: numpy.ndarray, array_len: int).

Example: write_array('data/P.csv', my_model.P, 1861)

Similarly, the read_matrix(filepath: str, matrix: numpy.ndarray, array_len: int) and write_matrix(filepath: str, matrix: numpy.ndarray, array_len: int) functions can be used to read or write matrices to and from a csv file of the form <i>,<j>,<value>

Example: read_matrix('data/t.csv', my_model.t, 1861)

Model Simulation

Calculating Tau matrix

Once the relevant model parameters have been loaded, tau values can be calculated from the input t values through the class method

<model_name>.calc_tau(coeff_theta: float, n_iterations: int, debug_level: int)

A theta coefficient is required as well as the desired number of iterations of the tau approximation algorithm (for more details see the technical implementation section) and the debug level which should be an integer between 0 and 3 (default 0) representing the desired detail of debug output in the terminal.

Example:

model.calc_tau(3.0, 100, 1)

Calculating P and Pi vectors

As the calculation of the optimal P and Pi vectors is quite sensitive a thorough initial solution should be found using a powerful non-linear optimization algorithm. This is currently implemented in julia and called from the python interface through

<model_name>.init_p_pi(coeff_theta: str)

As this may take a significant amount of time to run, if multiple simulations of the model with the same starting conditions are expected it is recommended to save and load the results of this initial P/Pi calculation to speed up future runs of the model.

After the initial calculation, a much faster albeit less accurate iterative algorithm can be called to update the P and Pi vectors with

<model_name>.calc_p_pi(coeff_theta: str, diff_limit: float, debug_level: int, relative_diff: bool = False)

The diff_limit variable represents the desired level of iterative 'stability' for the calculation, i.e. the algorithm will iteratively solve for values of P and Pi until the largest difference in resulting values between iterations is smaller than this number. The relative_diff boolean flag (default False) indicates whether this diff_limit should be applied to the absolute or relative difference.

Calculating Additional Values

Other calculation of model variables can be accessed as class methods including (along with their relevant parameters):

calc_Y(coeff_theta, debug_level)
calc_X(coeff_theta, debug_level)
calc_Xi(coeff_theta, debug_level)

Updating Core Model Parameters

Additional functions are provided to allow for updating the core parameters of the model. They can be accessed as class methods:

update_A(coeff_eta, coeff_beta, coeff_sigma, debug_level, normalized, log_A, translate_A
- The normalized flag is an optional boolean value (default False). When True it sets all Xi values for adjacent nodes to be a value between 0 and 1 scaled by the sum of all Xi values.
- The log_A flag is an optional boolean value (default True). When True it scales the increase in A to log levels (normalized by the A value of the source node). When False it simply grows A additively in linear terms based on the result of the growth equation.
- The translate_A flag is an optional boolean value (default True). When True it scales the effective A values used in calculating their influence on neighbors so that the minimum value is always zero. This means that nodes with very low (for technical reasons nodes must have nonzero A) have no influence on generating growth in A.
update_L(coeff_a, coeff_b, coeff_f, coeff_d, coeff_xi, coeff_lambda, debug_level)
update_t(coeff_chi, debug_level)

Technical implementation details

TODO o

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
SolveInitial		SolveInitial
cpp		cpp
data		data
sim		sim
.gitignore		.gitignore
Makefile		Makefile
README.md		README.md
compile_flags.txt		compile_flags.txt
juliapkg.json		juliapkg.json
pyrightconfig.json		pyrightconfig.json
test_julia.jl		test_julia.jl
test_sims.py		test_sims.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

LRG Model Simulation Code Documentation

Table of contents

Introduction

Installation

Requirements

Walkthrough

Usage

Initialization

Data Storage and Access

Data I/O

Model Simulation

Calculating Tau matrix

Calculating P and Pi vectors

Calculating Additional Values

Updating Core Model Parameters

Technical implementation details

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

LRG Model Simulation Code Documentation

Table of contents

Introduction

Installation

Requirements

Walkthrough

Usage

Initialization

Data Storage and Access

Data I/O

Model Simulation

Calculating Tau matrix

Calculating P and Pi vectors

Calculating Additional Values

Updating Core Model Parameters

Technical implementation details

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages