# Numerical Software Development

**Objectives**: This course discusses the art of building numerical software, i.e., computer programs that apply numerical methods to solve mathematical or physical problems.  We will use Python and C++ together with related tools (e.g., [bash](https://www.gnu.org/software/bash/), [git](https://git-scm.com), and [make](https://www.gnu.org/software/make/)) to learn modern development processes.  By completing this course, students will acquire the fundamental skills for developing modern numerical software.

**Prerequisites**:  This is a graduate- or senior-level course open to students who have taken computer architecture, engineering mathematics, or equivalents.  Working knowledge of Linux or another Unix-like system is required.  Prior knowledge of numerical methods is recommended.  The instructor uses English in the lectures and discussions.

# How to study

* This is a practical course.  No textbook is available for this specific interdisciplinary subject.
* To study the subject, students are required to research online documents and source code, and to write programs for practice.
* In-class instruction and [course notes](https://github.com/yungyuc/nsd) are provided for guidance.
* References:
  * Computer Systems: A Programmer's Perspective, Randal E. Bryant and David R. O'Hallaron: https://csapp.cs.cmu.edu/
  * Python documentation: https://docs.python.org/3/
  * Cppreference: https://en.cppreference.com/
  * [Effective Modern C++](https://www.oreilly.com/library/view/effective-modern-c/9781491908419/), Scott Meyers, O'Reilly, 2014
  * Source code: [cpython](https://github.com/python/cpython), [numpy](https://github.com/numpy/numpy), [xtensor](https://github.com/QuantStack/xtensor), and [pybind11](https://github.com/pybind/pybind11)

# Course design

* You are expected to learn programming languages yourself.  Python is never a problem, but you could find it challenging to self-teach C++.  Students are encouraged to form study groups for practicing C++, and discuss with the instructor and/or the teaching assistant.
* Grading: homework 30%, mid-term exam 30%, term project 40%.
* There are 13 or 14 lectures on the subjects of numerical software development using Python and C++.
* There will be 6 homework assignments for you to exercise.  Programming in Python and/or C++ is required.
* A mid-term examination will be conducted to assess students' understanding of the analytical materials.
* The term project will be used to assess students' overall coding skills.  Presentation is required.  Failure to present results in **0 points** for this part.  Check [the term project page](term_project.ipynb) before you start.

# Course schedule

* W1 (2/17) Lecture 1: Introduction
* W2 (2/24) Lecture 2: Fundamental engineering practices (homework \#1)
* W3 (3/2) Lecture 3: Python and numpy
* W4 (3/9) Lecture 4: C++ and computer architecture (homework \#2)
* W5 (3/16) Lecture 5: Matrix operations
* W6 (3/23) Lecture 6: Cache optimization (homework \#3)
* W7 (3/30) Lecture 7: SIMD
* W8 (4/6) No class; activity week
* W9 (4/13) Mid-term examination
* W10 (4/20) Lecture 8: Memory management (homework \#4)
* W11 (4/27) Lecture 9: Smart pointer
* W12 (5/4) Lecture 10: Modern C++ (homework \#5)
* W13 (5/11) Lecture 11: C++ and C for Python
* W14 (5/18) Lecture 12: Arrays in C++ (homework \#6)
* W15 (5/25) Lecture 13: Array-oriented design
* W16 (6/1) Lecture 14: TBD
* W17 (6/8) Term project presentation 1/2
* W18 (6/15) Term project presentation 2/2

# Lecture 1 [Introduction](01_introduction/introduction.ipynb)

# Lecture 2 [fundamental engineering practices](02_engineering/engineering.ipynb)

Writing computer code is only a fraction of software engineering.  A large share of the effort is spent on the coding infrastructure.  The key to building the engineering system is automation.

1. Automation
   1. Bash scripting
   2. Makefile
   3. CMake (cross-platform, multi-language automation)
2. Version control and regression
   1. Git version control system
   2. Automatic testing: author and run tests with Google Test and pytest
   3. Wrap to Python and test there: pybind11
   4. Continuous integration to avoid regression
3. Work that cannot be automated
   1. Code review (use github for demonstration)
   2. Timing to debug for performance
      1. Wall time and CPU cycles
      2. System time and user time
      3. Python timing tools

([bare notebook](02_engineering/engineering_bare.ipynb))
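
The timing items in the outline above (wall time versus CPU time) can be sketched with the Python standard library.  This is only a minimal illustration, not the course's own material: `busy_sum` and `measure` are hypothetical helper names, while `time.perf_counter` and `time.process_time` are the standard timers for wall-clock time and for user plus system CPU time of the current process.

```python
import time


def busy_sum(n):
    """CPU-bound work: sum of squares computed in a plain Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def measure(func, *args):
    """Return (result, wall_seconds, cpu_seconds) for one call of func."""
    wall0 = time.perf_counter()  # monotonic wall-clock timer
    cpu0 = time.process_time()   # user + system CPU time of this process
    result = func(*args)
    cpu1 = time.process_time()
    wall1 = time.perf_counter()
    return result, wall1 - wall0, cpu1 - cpu0


if __name__ == "__main__":
    # A busy loop spends its wall time on the CPU.
    _, wall, cpu = measure(busy_sum, 200_000)
    print(f"busy loop: wall={wall:.4f}s cpu={cpu:.4f}s")
    # Sleeping consumes wall time but almost no CPU time.
    _, wall, cpu = measure(time.sleep, 0.2)
    print(f"sleep:     wall={wall:.4f}s cpu={cpu:.4f}s")
```

A large gap between wall time and CPU time usually means the code is waiting (I/O, sleep, locks) rather than computing, which changes what is worth optimizing.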

# Lecture 3 [Python and numpy](03_numpy/numpy.ipynb)

Numerical software is usually developed as a platform.  It works like a library providing data structures and helpers to solve problems.  The users will use a scripting engine it provides to build applications.  Python is a popular choice for the scripting engine.

1. Organize Python modules
   1. Scripts
   2. Modules
   3. Package
2. Use numpy for array-oriented code
   1. Data type
   2. Construction
   3. Multi-dimensional arrays
   4. Selection
   5. Broadcasting
3. Use tools for numerical analysis
   1. Jupyter notebook
   2. Matplotlib
   3. Linear algebra using numpy and scipy
   4. Package management with conda and pip

([bare notebook](03_numpy/numpy_bare.ipynb))
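
As a small illustration of the broadcasting item in the outline above, the shape rule can be written in pure Python without importing numpy.  The sketch assumes the commonly documented rule: shapes are aligned from the trailing dimension, and two dimensions are compatible when they are equal or one of them is 1.  `broadcast_shape` is a hypothetical helper name for this example and is not a numpy function.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape of two shapes, or raise ValueError."""
    result = []
    # Align from the trailing dimension; a missing leading dimension counts as 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            raise ValueError(
                f"shapes {shape_a} and {shape_b} cannot be broadcast together"
            )
    return tuple(reversed(result))


if __name__ == "__main__":
    print(broadcast_shape((3, 1), (4,)))       # a column against a row
    print(broadcast_shape((5, 1, 4), (3, 1)))  # mixed ranks
```

Working the rule out by hand like this is a quick way to predict the result shape before running array code.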

# Lecture 2 [C++ and computer architecture](02_cpp/cpp.ipynb)

The low-level code of numerical software must be high-performance.  Industry chose C++ because it can take advantage of everything that a hardware architecture offers while using any level of abstraction.

1. Fundamental data types
   1. Command-line interface for compiler tools
      1. Compiler, linker
      2. Multiple source files, separation of declaration and definition, external libraries
      3. Build multiple binaries and shared objects (dynamically linked libraries)
   2. Integer, signedness, pointer, array indexing
   3. Floating-point, rounding, exception handling
   4. Numeric limit
2. Object-oriented programming
   1. Class, encapsulation, accessor, reference type
   2. Constructor and destructor
   3. Polymorphism and RTTI
   4. CRTP
3. Standard template library (STL)
   1. std::vector, its internals, and why the buffer address is dangerous
   2. std::array, std::list
   3. std::map, std::set, std::unordered_map, std::unordered_set

([bare notebook](02_cpp/cpp_bare.ipynb))
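
The signedness, rounding, and numeric-limit items in the outline above can be previewed from Python before meeting them in C++.  The sketch assumes that Python's `float` is an IEEE 754 double, which holds on common platforms; the variable names are only for illustration.

```python
import math
import sys

# The same 32 bits read as an unsigned or as a signed (two's complement) integer.
raw = bytes([0xFF, 0xFF, 0xFF, 0xFF])
as_unsigned = int.from_bytes(raw, byteorder="little", signed=False)
as_signed = int.from_bytes(raw, byteorder="little", signed=True)

# Machine epsilon: the gap between 1.0 and the next representable double.
eps = sys.float_info.epsilon

# Adding half of epsilon to 1.0 lands exactly halfway between two doubles;
# round-to-nearest-even returns 1.0, so the addition is lost.
rounded_away = (1.0 + eps / 2) == 1.0

# 0.1, 0.2, and 0.3 are not exactly representable in binary floating point.
exactly_equal = (0.1 + 0.2) == 0.3
close_enough = math.isclose(0.1 + 0.2, 0.3)

if __name__ == "__main__":
    print("unsigned:", as_unsigned, "signed:", as_signed)
    print("epsilon:", eps, "largest double:", sys.float_info.max)
    print("1.0 + eps/2 == 1.0:", rounded_away)
    print("0.1 + 0.2 == 0.3:", exactly_equal, "isclose:", close_enough)
```

In C++ the corresponding limits are exposed through `std::numeric_limits`, and the signed or unsigned reading is fixed by the declared integer type.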

# Lecture 4 [memory management](04_mem/mem.ipynb)

Numerical software tends to use as much memory as a workstation has.  The memory has two major uses: (i) to hold the required huge amount of data, and (ii) to gain speed.

1. Linux memory model: stack, heap, and memory map
2. C memory management API
3. C++ memory management API
4. STL allocator API
5. Object counter

([bare notebook](04_mem/mem_bare.ipynb))

# Lecture 5 [matrix operations](05_matrix/matrix.ipynb)

As linear algebra is fundamental to almost everything that uses mathematics, matrices are everywhere in numerical analysis.  There is no shortage of linear algebra software packages, and it is critically important to understand how they work.

1. POD arrays and majoring (row-major and column-major layout)
2. Matrix-vector and matrix-matrix operations
3. Linear algebra

([bare notebook](05_matrix/matrix_bare.ipynb))
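
To make the idea of majoring concrete, the sketch below stores a matrix in a flat buffer and computes the offset of element (row, col) for both layouts, then uses it in a matrix-vector product.  It is a pure-Python illustration under the usual definitions of row-major and column-major order; `flat_index` and `matvec` are hypothetical helper names.

```python
def flat_index(row, col, nrow, ncol, order="row"):
    """Offset of element (row, col) in a flat buffer of an nrow x ncol matrix."""
    if not (0 <= row < nrow and 0 <= col < ncol):
        raise IndexError("matrix index out of range")
    if order == "row":
        # Row-major: elements of one row are contiguous.
        return row * ncol + col
    if order == "column":
        # Column-major: elements of one column are contiguous.
        return col * nrow + row
    raise ValueError("order must be 'row' or 'column'")


def matvec(buffer, nrow, ncol, vec, order="row"):
    """Compute y = A x where A is stored in a flat buffer."""
    return [
        sum(buffer[flat_index(i, j, nrow, ncol, order)] * vec[j]
            for j in range(ncol))
        for i in range(nrow)
    ]


if __name__ == "__main__":
    # A = [[1, 2, 3],
    #      [4, 5, 6]]
    row_major = [1, 2, 3, 4, 5, 6]
    col_major = [1, 4, 2, 5, 3, 6]
    x = [1, 0, -1]
    print(matvec(row_major, 2, 3, x, order="row"))
    print(matvec(col_major, 2, 3, x, order="column"))
```

Both calls describe the same matrix; only the memory order differs, which is what later determines the access stride.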

# Lecture 6 [cache optimization](06_cache/cache.ipynb)

1. How cache works and its importance to performance
2. Stride analysis
3. Tiling

([bare notebook](06_cache/cache_bare.ipynb))
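
The loop structure behind tiling can be sketched with a matrix transpose on a flat row-major buffer.  This is an illustration of the loop restructuring only: in pure Python the interpreter overhead hides any cache effect, so no speed-up should be expected here.  `transpose_naive`, `transpose_tiled`, and the `tile` parameter are hypothetical names.

```python
def transpose_naive(src, n):
    """Transpose an n x n row-major matrix; writes to dst have stride n."""
    dst = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            dst[j * n + i] = src[i * n + j]
    return dst


def transpose_tiled(src, n, tile=4):
    """Same result, but visit the matrix one tile x tile block at a time."""
    dst = [0.0] * (n * n)
    for ii in range(0, n, tile):          # block row
        for jj in range(0, n, tile):      # block column
            # Within a block, both reads and writes stay in a small region.
            for i in range(ii, min(ii + tile, n)):
                for j in range(jj, min(jj + tile, n)):
                    dst[j * n + i] = src[i * n + j]
    return dst


if __name__ == "__main__":
    n = 10
    src = [float(k) for k in range(n * n)]
    print(transpose_naive(src, n) == transpose_tiled(src, n, tile=4))
```

In compiled code, choosing the tile so that a block of the source and a block of the destination fit in cache together is what makes the restructured loops pay off.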

# Lecture 7 [SIMD: vector processing](07_simd/simd.ipynb)

1. Types of parallelism
2. x86 intrinsic functions
3. Inspect assembly

([bare notebook](07_simd/simd_bare.ipynb))

# Lecture 8 [modern C++ I: ownership and smart pointers](08_09_moderncpp/moderncpp1.ipynb)

1. Pointers and ownership
    1. Raw pointer
    2. Reference
    3. Ownership
    4. Smart pointers
        1. `unique_ptr`
        2. `shared_ptr`
2. Revisit shared pointer
    1. Make Data exclusively managed by `shared_ptr`
    2. Get `shared_ptr` from `this`
    3. Cyclic reference and `weak_ptr`

([bare notebook](08_09_moderncpp/moderncpp1_bare.ipynb))
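
The ownership idea behind cyclic references and `weak_ptr` has a rough analogy in Python's `weakref` module.  The sketch below is that analogy only, not C++ code and not the lecture's example: a parent owns its children through strong references, and each child points back through a weak reference so that the back pointer does not keep the parent alive.  `Node` is a hypothetical class, and the immediate release after `del` assumes CPython's reference counting.

```python
import gc
import weakref


class Node:
    def __init__(self, name):
        self.name = name
        self.children = []   # strong (owning) references, like shared_ptr
        self._parent = None  # weak (non-owning) reference, like weak_ptr

    @property
    def parent(self):
        """Return the parent, or None if it is unset or already destroyed."""
        return self._parent() if self._parent is not None else None

    def add_child(self, child):
        self.children.append(child)
        child._parent = weakref.ref(self)


root = Node("root")
leaf = Node("leaf")
root.add_child(leaf)
parent_name = leaf.parent.name

# Dropping the only strong reference to root destroys it, because the
# back pointer held by leaf is weak and does not count as ownership.
del root
gc.collect()  # not needed on CPython; kept for other interpreters
orphaned = leaf.parent is None

if __name__ == "__main__":
    print(parent_name, orphaned)
```

If the back pointer were a strong reference instead, parent and child would keep each other alive, which is the cycle that `weak_ptr` breaks in C++.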

# Lecture 9 [modern C++ II: more than templates](08_09_moderncpp/moderncpp2.ipynb)

1. Copy elision / return value optimization
2. Move semantics and copy elision
    1. Forced move is a bad idea
3. Data concatenation
    1. Style 1: return `vector`
    2. Style 2: use output `vector`
    3. Style 3: use a class for both return and output argument
4. Variadic template
5. Perfect forwarding
6. Lambda expression
    1. Keep a lambda in a local variable
    2. Difference between `auto` and `std::function`
7. Closure
    1. Comments on functional style

([bare notebook](08_09_moderncpp/moderncpp2_bare.ipynb))

# Lecture 10 [xtensor: arrays in C++](10_xtensor/xtensor.ipynb)

1. Python is slow but easy to write
2. Speed up by using numpy (still in Python)
3. Xtensor: write iterative code at C++ speed using arrays
4. Effect of house-keeping code

([bare notebook](10_xtensor/xtensor_bare.ipynb))

# Lecture 11 [pybind11: binding between Python and C++](11_pybind/pybind.ipynb)


1. Why do we use scripting
   1. Research code
   2. Full-fledged application
   3. Scripting for modularization
2. Pybind11 build system
   1. Setuptools
   2. CMake with a sub-directory
   3. CMake with installed pybind11
3. Additional wrapping layer for customization
4. Wrapping API
   1. Functions and property
   2. Named and keyword arguments
   3. What happens in Python stays in Python (or pybind11)
5. See how Python plays
   1. Linear wave
   2. The inviscid Burgers equation
6. Manipulate Python objects in C++
7. Python containers
   1. `tuple`
   2. `list`
   3. `dict`

([bare notebook](11_pybind/pybind_bare.ipynb))

# Lecture 12 [cpython API: operate Python from C](12_cpython/cpython.ipynb)

1. Use CPython API with pybind11
2. `PyObject` reference counting
3. Built-in types
   1. Cached value
   2. Attribute access
   3. Function call
   4. Tuple
   5. Dictionary
   6. List
4. Useful operations
   1. Import
   2. Exception
5. Python memory management
   1. PyMem interface
   2. Small memory optimization
   3. Tracemalloc

([bare notebook](12_cpython/cpython_bare.ipynb))
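
Reference counting and `tracemalloc` from the outline above can be observed from the Python side with the standard library.  This sketch assumes the default CPython build: `sys.getrefcount` reports an implementation-specific number, so only the differences between calls are meaningful.  `Payload` is a hypothetical class used to get a fresh object.

```python
import sys
import tracemalloc


class Payload:
    """A throwaway class so the object below is not shared with anything."""


obj = Payload()

# Storing the object in a container adds one strong reference.
base = sys.getrefcount(obj)
holder = [obj]
with_holder = sys.getrefcount(obj)

# Destroying the container gives that reference back.
del holder
after_del = sys.getrefcount(obj)

# tracemalloc records allocations made through Python's memory allocators.
tracemalloc.start()
before, _ = tracemalloc.get_traced_memory()
data = [float(i) for i in range(100_000)]
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
grown = current - before

if __name__ == "__main__":
    print("refcount change:", with_holder - base, after_del - base)
    print(f"traced growth: {grown} bytes, peak: {peak} bytes")
```

The same counts are what `Py_INCREF` and `Py_DECREF` manipulate when Python objects are handled from C.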

# Lecture 13 [array-oriented design](13_arraydesign/arraydesign.ipynb)

1. Design interface with arrays
2. Conversion between dynamic and static semantics
3. Insert profiling code

([bare notebook](13_arraydesign/arraydesign_bare.ipynb))

[Overflow topics](overflow.ipynb)