Where to go from here?
======================

Learning a programming language is the first step towards becoming a computationalist who advances science and engineering through computational modelling and simulation.

Below we list some additional skills that can be very beneficial for day-to-day computational science work; the list is of course not exhaustive.

Advanced programming
--------------------

This text has put emphasis on providing a robust foundation in programming, covering control flow, data structures and elements of functional and procedural programming. We have not touched on object orientation in great detail, nor have we discussed some of Python’s more advanced features, such as iterators and decorators.
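To give a flavour of the two features just mentioned, here is a minimal sketch. The names `count_calls`, `square` and `Countdown` are made up for this illustration and do not appear elsewhere in the text. The decorator wraps a function to count its calls, and the class implements the iterator protocol through `__iter__` and `__next__`.

```python
import functools


def count_calls(func):
    """Decorator (illustrative): count how often `func` is called."""
    @functools.wraps(func)  # keep the name and docstring of `func`
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls
def square(x):
    return x * x


class Countdown:
    """Iterator (illustrative) that yields n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration  # signals the end of the iteration
        value = self.n
        self.n -= 1
        return value


squares = [square(i) for i in Countdown(3)]
print(squares)        # [9, 4, 1]
print(square.calls)   # 3
```

In everyday code a generator function using `yield` is often a shorter way to obtain an iterator; the class form is shown here only because it makes the protocol explicit.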

Compiled programming language
-----------------------------

When performance becomes the highest priority, we may need to use compiled code, and likely embed it in Python code to carry out the computations that are the performance bottleneck.

Fortran, C and C++ are sensible choices here; maybe Julia in the near future.

We also need to learn how to integrate the compiled code with Python using tools such as Cython, Boost.Python, ctypes and SWIG.

Testing
-------

Good coding is supported by a range of unit and system tests that can be run routinely to check that the code works correctly. Tools such as doctest, nose and pytest are invaluable, and we should learn at least how to use pytest (or nose).
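As a minimal sketch of what such tests can look like, the following example combines a doctest (the `>>>` lines inside the docstring) with a pytest-style test function. The function `mean` and the test `test_mean` are hypothetical and chosen only for illustration.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence.

    >>> mean([1, 2, 3])
    2.0
    """
    return sum(values) / len(values)


def test_mean():
    # pytest collects functions whose names start with `test_`
    # and reports any failing `assert` statement.
    assert mean([2, 4]) == 3.0
    assert mean([5]) == 5.0
```

Assuming the code is saved in a file such as `test_mean.py` (the file name is an assumption), running `pytest test_mean.py` executes the test function, and `python -m doctest test_mean.py` checks the example in the docstring.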

Simulation models
-----------------

A number of standard simulation methods, such as Monte Carlo, molecular dynamics, lattice-based models, agent-based models, and finite difference and finite element models, are commonly used to solve particular simulation challenges; it is useful to have at least a broad overview of these.
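To illustrate the simplest of these methods, here is a small Monte Carlo sketch that estimates π by sampling random points in the unit square and counting how many fall inside the quarter circle of radius 1. The function name `estimate_pi` and its parameters are assumptions made for this example.

```python
import random


def estimate_pi(n_samples, seed=0):
    """Estimate pi from `n_samples` random points in the unit square."""
    rng = random.Random(seed)  # fixed seed makes the run reproducible
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:  # point lies inside the quarter circle
            inside += 1
    # area of quarter circle / area of square = pi / 4
    return 4.0 * inside / n_samples


print(estimate_pi(100_000))
```

The statistical error of such an estimate decreases only like one over the square root of the number of samples, which is typical for Monte Carlo methods.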

Software engineering for research codes
---------------------------------------

Research codes bring particular challenges: the requirements may change during the lifetime of the project, and we need great flexibility while retaining reproducibility. A number of techniques are available to support this effectively.

Data and visualisation
----------------------

Processing and visualising large amounts of data can be a challenge. Fundamental knowledge of database design, 3D visualisation and modern data processing tools such as the Pandas Python package helps with this.

Version control
---------------

Using a version control tool, such as git or mercurial, should be standard practice: it improves the effectiveness of code writing significantly, helps with working in teams, and, maybe most importantly, supports reproducibility of computational results.

Parallel execution
------------------

Parallel execution of code can make it run substantially faster, in favourable cases by orders of magnitude. This could be done using MPI for inter-node communication, OpenMP for intra-node parallelisation, or a hybrid mode bringing both together.

The recent rise of GPU computing provides yet another avenue of parallelisation, and so do many-core chips such as the Intel Xeon Phi.

### Acknowledgements

Big thanks go to

-   Marc Molinari for carefully proofreading this manuscript around 2007.

-   Neil O’Brien for contributing to the SymPy section.

-   Jacek Generowicz for introducing me to Python in the last millennium, and for kindly sharing countless ideas from his Python course.

-   EPSRC (GR/T09156/01 and EP/G03690X/1) and the European Union (OpenDreamKit Horizon 2020 European Research Infrastructures project, #676541) for support.

-   Students and other readers who have provided feedback and pointed out typos and errors etc.

-   Thomas Kluyver who helped to translate the Python 2 LaTeX based document into Python 3 Jupyter Notebooks and provided the machinery to create html and pdf versions automatically (via his [bookbook package](https://github.com/takluyver/bookbook)).

[1] the vertical line is to show the division between the original components only; mathematically, the augmented matrix behaves like any other 2 × 3 matrix, and we code it in SymPy as we would any other.

[2] from the `help(preview)` documentation: “Currently this depends on pexpect, which is not available for windows.”

[3] The exact value for the upper limit is available in `sys.maxint`.

[4] We add for completeness that a C program (or C++ or Fortran) that executes the same loop will be about 100 times faster than the Python float loop, and thus about 100\*200 = 20000 times faster than the symbolic loop.

[5] In this text, we usually import `numpy` under the name `N` like this: `import numpy as N`. If you don’t have `numpy` on your machine, you can substitute this line by `import Numeric as N` or `import numarray as N`.

[6] Historical note: this has changed from scipy version 0.7 to 0.8. Before 0.8, the return value was a float if a one-dimensional problem was to be solved.
