This repository has been archived by the owner on May 28, 2020. It is now read-only.

For New Coders

Patrick Fuller edited this page Jan 11, 2015 · 2 revisions

If you are new to programming, avoid the urge to copy-paste the commands in this wiki. Instead, take the time to understand what each command is doing and why it is written the way it is. This process will be slower at first, but taking the time to learn will seriously pay off in the long term.

If you are looking for educational resources, I highly recommend Software Carpentry. This organization is focused solely on improving programming literacy in the academic hard sciences. I like them because they understand that "scientists view programming as a tax they have to pay in order to do science". This keeps their tutorials succinct, without going too deep into computer science theory. If you find Software Carpentry too slow, you may also be interested in Learn Code the Hard Way.

If you are looking for practice, websites such as Codecademy and Project Euler may also be useful.

####Text Editors

As you learn coding, you'll start seeing text editors discussed everywhere (obligatory xkcd). When you're starting out, don't get too wrapped up in this. It's just a text editor. Get a basic text editor with syntax highlighting: Windows has Notepad++, Mac has TextWrangler, and Linux often has gedit installed by default.

Once you have some practice coding, I would recommend trying out an integrated development environment, or IDE. These are text editors plus a range of features that make coding easier. There are a ton out there: Atom, Eclipse, PyCharm, Wing, and many more. Play around with them and see what you like.

IDEs are great, and many good programmers use them full time. However, there is a sizable group of people that has taken a side in the vim vs. emacs editor war. I personally use vim, but it doesn't really matter. What matters much more than the editor you use is the quality of code you write. Don't get too obsessed with the tools.

####Package Managers

As you learn more about coding, you will find yourself having to install and uninstall hundreds to thousands of small programs, called "packages" or "libraries". If you install these the same way you install programs like MS Office (e.g. download an executable and run it), you will overwhelm yourself very quickly. Luckily, package managers exist. A package manager is a tool that greatly simplifies installing, uninstalling, and upgrading other packages. For example, to upgrade every single program on a Debian Linux computer, you simply type `sudo apt-get upgrade` into a terminal (after `sudo apt-get update` has refreshed the package list). This is very useful when you have hundreds of programs to manage.

While there has been a push to have only one package manager that controls everything, the current state of programming is more divided. Most programming languages come with their own specialized package managers, and each operating system additionally has its own package manager. This means that you will probably have a couple of package managers to handle. This is not as simple as one manager, but it is still much better than installing everything manually.

Below, I will break package managers into Python-specific and OS-specific. Wherever possible, I give you the most popular way to do things without getting too deep into the why. That being said, there is some explanation.

####Virtual Environments

An important aspect of coding is being able to reproduce results. This may seem easy (why not just re-run the code?), but you will quickly find that the same code behaves differently across operating system, language, and package versions. A half-baked solution is to force users to pin the versions of everything on their computer, but this can break other programs on that computer and generally discourages code development. The solution is to use something called "virtual environments", which enable you to manage multiple versions of the same packages side by side. Package managers often come with virtual environment managers, which are easy to use.
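To make the "same code, different behavior" problem concrete, here is a minimal sketch that prints the details most likely to differ between two computers. The function name `describe_environment` is my own invention for illustration; it only uses Python's standard library.

```python
import platform
import sys


def describe_environment():
    """Collect details that commonly differ between machines."""
    return {
        "python_version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "operating_system": platform.system(),
        "interpreter_path": sys.executable,
    }


if __name__ == "__main__":
    # Print one "key: value" line per detail, in a stable order.
    for key, value in sorted(describe_environment().items()):
        print("{}: {}".format(key, value))
```

Saving this output next to your results is a cheap habit that makes it easier to figure out later why a script behaves differently somewhere else.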

#Programming Language

####Python

The Python community is currently split into two versions: Python 2 and Python 3. Luckily, RASPA2 can be run with both Python 2 and Python 3, and the package manager we will install can easily switch between versions. Try using Python 3 for every project, and only switch to Python 2 when necessary.
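If you are not sure which version you are running, or why the split matters, the sketch below may help. It is only an illustration (the helper names are made up): it reports the major version and shows one classic difference, integer division, handled in a way that gives the same answer on both versions.

```python
import sys


def python_major_version():
    """Return 2 or 3, depending on the interpreter running this script."""
    return sys.version_info[0]


def mean(values):
    """Average a list of numbers.

    float() forces true division; on Python 2, dividing two integers
    would otherwise silently round down.
    """
    return sum(values) / float(len(values))


if __name__ == "__main__":
    print("Running on Python {}".format(python_major_version()))
    print("mean([1, 2]) = {}".format(mean([1, 2])))
```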

The package manager we will install is called "conda". To install:

  • Go to this site and download the Python 3.4 version for your OS (install Python 3.4 even if you need Python 2 - I'll show you why later).
  • Run it by either double-clicking (Windows) or typing `bash Miniconda3-*.sh` in a terminal (Mac/Linux).
  • Make sure to select "yes" for every option.
  • To test, restart your terminal and type `which python`. It should point to `~/miniconda3/bin/python`.
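The last check can also be done from inside Python, which is handy on Windows where `which` is not available. This is a rough sketch, not part of conda itself: `interpreter_is_under` is a hypothetical helper, and the `~/miniconda3` directory is just the default install location mentioned above.

```python
import os
import sys


def interpreter_is_under(directory, executable=sys.executable):
    """Return True if the Python executable lives inside `directory`."""
    directory = os.path.abspath(os.path.expanduser(directory))
    executable = os.path.abspath(executable)
    return executable.startswith(directory + os.sep)


if __name__ == "__main__":
    print("Python executable: {}".format(sys.executable))
    print("Inside ~/miniconda3: {}".format(interpreter_is_under("~/miniconda3")))
```

If the second line prints `False`, your terminal is probably still picking up a different Python installation.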

We can now use conda to install, uninstall, and upgrade Python packages in one-line commands. Let's grab two of the most commonly used Python packages: numpy and ipython. To install both packages, run the command:

conda install numpy ipython

To test, type `ipython` to enter an IPython terminal, and type `import numpy`. If no error pops up, then you've successfully installed both of these packages.
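If you want to check several packages at once, the same "try to import it" test can be scripted. The sketch below is one way to do it with the standard library; `missing_packages` is a name I made up for this example.

```python
import importlib


def missing_packages(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The package is not installed in the current Python setup.
            missing.append(name)
    return missing


if __name__ == "__main__":
    not_found = missing_packages(["numpy", "IPython"])
    if not_found:
        print("Missing: {}".format(", ".join(not_found)))
    else:
        print("All packages imported successfully.")
```

Note that the import name can differ from the install name: the `ipython` package is imported as `IPython`.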

Conda also handles virtual environments. For example, you may come across projects that require the use of Python 2.7. You can create a new environment named "py27" that uses Python 2.7 by typing:

conda create -n py27 python=2.7

Now, whenever you want to use this version of Python, type `source activate py27` (on conda 4.4 and later, `conda activate py27`). Your terminal prompt should now be prefixed with `(py27)`. While in this "virtual environment", installing packages (e.g. `conda install numpy`) installs into py27 alone, so you can install fixed versions without affecting your entire system. Finally, to deactivate and go back to your default Python setup, type `source deactivate` (or `conda deactivate` on newer versions).

This enables you to easily switch between Python 2 and Python 3. You can freely add more environments, using each one to "freeze" your package versions without affecting your overall system.
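When you juggle several environments, it is easy to lose track of which one a script is actually running in. Here is a small sketch for checking from within Python. It assumes conda exposes the active environment's name in the `CONDA_DEFAULT_ENV` environment variable; treat that as an assumption to verify on your own install. The function name is hypothetical.

```python
import os
import sys


def active_environment(environ=os.environ):
    """Return the active conda environment name, or None if none is set."""
    return environ.get("CONDA_DEFAULT_ENV")


if __name__ == "__main__":
    print("Environment: {}".format(active_environment()))
    # sys.prefix is the directory this Python (and its packages) lives in.
    print("Install location: {}".format(sys.prefix))
    print("Python version: {}.{}".format(*sys.version_info[:2]))
```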

Conda is a package manager designed specifically for scientific coders, but there is a more prevalent Python package manager called "pip". As you start working with Python, you will see a lot of commands along the lines of `pip install x`. Fortunately, we can use pip and conda simultaneously by installing pip through conda.

conda install pip

Now, you can use pip to install packages outside of conda's options. Note that if you installed pip through any other method, you should uninstall it before installing through conda.

#Operating Systems

####Windows

For novice programmers, Windows is a fine operating system. It can install conda, and you can run Python scripts easily. Text editors such as Notepad++ deal with some of the quirks in how Windows stores text (such as line endings), and can be used to create scripts that should run on any operating system.
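One such quirk is line endings, which is my guess at what matters most here: Windows traditionally ends each line with two characters (`\r\n`), while Mac and Linux use one (`\n`). As a sketch of what an editor does behind the scenes, the hypothetical helper below converts text to the Mac/Linux convention.

```python
def to_unix_newlines(text):
    """Convert Windows (\\r\\n) and old Mac (\\r) line endings to \\n."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


if __name__ == "__main__":
    windows_text = "first line\r\nsecond line\r\n"
    # repr() makes the otherwise invisible line-ending characters visible.
    print(repr(to_unix_newlines(windows_text)))
```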

As you progress in your coding ability, Windows will begin to limit you. There is no supported package manager, and convention differences between Windows and Mac/Linux (file paths, for example) will hamper the use of more complex programming techniques. When this happens, I recommend installing an operating system such as Ubuntu alongside Windows and using that installation for programming. There are some other solutions (e.g. PowerShell and remote coding), but I advise against trying to force Windows into being something it's not.

####Mac

Mac laptops are great to use for writing and testing code before sending to Linux servers for running. Personally, I think they're the best of both worlds for a laptop, as they can do both MS Office and programming.

Your Mac does not come with an operating system package manager by default. However, the Homebrew package manager fixes this. You can get Homebrew with:

ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

and you can install operating system packages with commands like `brew install gcc`.

####Linux

Linux computers are the overwhelming standard for servers (e.g. supercomputers), and the operating system is especially useful when installed on a powerful server and used as a workhorse.

Linux systems come with their own package managers by default. The most popular Linux distribution (or "distro") today is Ubuntu, which can install packages with commands like `sudo apt-get install g++`.

The `apt-get` command works on what are called "Debian" distros, which include Ubuntu. A competing distro is Fedora, which is a "Red Hat" distro. Red Hat distros use the `yum` command (or its successor `dnf` on newer releases) to the same effect. That being said, the programming community has largely accepted Ubuntu as the flavor of choice, and you now generally only see non-Debian distros on older machines.
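If you land on an unfamiliar machine and are not sure which of these package managers it has, you can check which commands exist. Below is a small sketch using `shutil.which` from the standard library (Python 3.3 and later); the function name and the list of candidates are my own choices for illustration.

```python
import shutil

# Package manager commands mentioned in this page.
CANDIDATES = ["apt-get", "yum", "brew"]


def available_package_managers(candidates=CANDIDATES, which=shutil.which):
    """Return the candidate commands that are found on this machine."""
    return [name for name in candidates if which(name) is not None]


if __name__ == "__main__":
    found = available_package_managers()
    if found:
        print("Found: {}".format(", ".join(found)))
    else:
        print("None of the known package managers were found.")
```

The `which` argument exists only so the lookup can be swapped out when testing; in normal use you can call the function with no arguments.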

By default, these package managers need root - or "sudo" - access. This is okay on computers you own, but it is a hassle on supercomputers. There are solutions, however. As discussed here, you can create a virtual environment (schroot) to get around the issue. Even though setting up schroot may be a pain, resist the temptation to install packages from source. You only need to set up a virtual environment once, but you'll need to build every package from source if you don't figure out how to use a package manager on your cluster.

From here, check out the installation page.
