# Introduction to Programming

What a title, huh?  "Programming" is such a hugely expansive term, covering a staggering number of techniques, languages, processes, hardware configurations, purposes, applications, scales, and so forth.

*So what is programming, really?*

The short version is that programming is a way of making a computer (another broad term!) perform tasks.  These tasks can be as simple as the addition of two numbers, or as complex as calculating the excited state energy of a fluorescent nucleotide as it moves through a solution of water and sodium chloride ions.  As the tasks become more complex, the code becomes more complex as well.  Small tasks may be manageable with tiny scripts or single-file programs, while larger ones may require the inclusion of other tools, multiple files, and more complicated design principles.

For the purposes of this Summer School, we'll mostly be sticking to the introductory/entry-level stuff, but remember that the actual limits of your programs are only in the capacity of your hardware and the breadth of your imagination.

## Fundamental Principles

There are some rules to programming that should really be learned early on, if only because knowing them early will make the rest considerably easier.  For many experienced programmers, the actual writing of code is a smaller portion of the overall process than one might expect.

**Version Control** - This is a very important habit to get into.  In its simplest form, version control keeps track of the changes you've made to your code as you go, and records who made which changes in a collaborative project.  More advanced use of version control (branches, for example) lets you maintain variants of the same software side by side, such as versions written for different types of hardware or for different scales of calculation.

**Code Formatting** - Many different groups and companies have what are known as "style guides" for any code produced in, by, or for the organization.  These often cover simple conventions like "how many spaces constitute a `tab` in your code?" or "What information, if any, should be included in comments at the top of each file?".  However, style guides can also go beyond text formatting into requirements like "Each function definition should include comments describing the arguments to the function and what the function returns on completion" or "Test suites must be included for all additions to the codebase before they may be considered for merging".
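As a small illustration of the kind of commenting such a guide might ask for, here is a sketch of a documented shell function.  The file name `add_numbers.sh`, the header layout, and the function itself are all made up for this example; real style guides differ in the details.

```bash
#!/bin/sh
# File:    add_numbers.sh   (hypothetical example file)
# Purpose: show a function documented the way a style guide might require.

# add_numbers: add two whole numbers.
#   Arguments: $1 - first number
#              $2 - second number
#   Output:    prints the sum to standard output
add_numbers() {
    echo $(( $1 + $2 ))
}

sum=$(add_numbers 2 3)
echo "2 + 3 = $sum"
```

Someone reading only the comment block can tell what the function needs and what it gives back without reading its body, which is the whole point of the rule.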

**Debugging** - This works best as a team effort, because the person who wrote the code will often miss some of their own mistakes that others readily find.  This is not at all an indicator of skill, intelligence, or character.  It is purely because we can get tunnel vision about our own code (it happens to the best of us), and because our brains are wired to recognize *and complete* patterns.  Where we might "see" a semicolon that is actually missing from the end of a line we wrote, because we expect it to be there, someone else who didn't write our code may notice the gap immediately.  Debugging is also a *highly* valued skill in industry, because the longer a problem goes unnoticed, the more expensive it usually becomes to fix.

**Pseudocode** - Other names for pseudocode include "algorithm development", "project management", "outlining", "planning", "thinking", and "jotting that down so I don't forget it later".  Pseudocoding is effectively writing out the stages of your program in plain language (*not code*) to ensure a clear understanding of the problem you're trying to solve.  Often, programmers will begin pseudocode with a very simple set of steps describing how they think about the problem.  Each step can then be broken down in more and more detail into a set of smaller, more manageable steps, until eventually you wind up with a complete list of steps that can be converted to computer commands.  Pseudocode can also provide some insight into ways the code might be optimized, such as by revealing opportunities to parallelise the execution, or by revealing regions where something being calculated can just be saved for reuse rather than recalculated later.
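To make the idea concrete, here is a minimal sketch of pseudocode turning into code.  The task (averaging a short list of numbers) and the numbers themselves are invented for illustration.  The plain-language steps sit at the top as comments, and each line of code is marked with the step it implements.

```bash
#!/bin/sh
# Pseudocode, written before any real code:
#   1. Start with a total of zero and a count of zero.
#   2. For each number in the list, add it to the total and add one to the count.
#   3. Divide the total by the count.
#   4. Report the result.

total=0                                      # step 1
count=0
for n in 4 8 15 16 23 42; do                 # step 2 (example numbers)
    total=$(( total + n ))
    count=$(( count + 1 ))
done
average=$(( total / count ))                 # step 3 (whole-number division)
echo "Average of $count numbers: $average"   # step 4
```

Note that the pseudocode says nothing about shell syntax.  The same four steps could be converted to any other language just as easily.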

*If only some of that made sense, you're in the right place.*


## Version Control

The most commonly used tool for version control is `git`, with related websites [GitHub](https://www.github.com), [GitLab](https://about.gitlab.com/), and [Bitbucket](https://bitbucket.org/).  For the purposes of this workshop, we'll focus on using GitHub because it's free and most (if not all) of us already have accounts there.

There are a few steps to do first if you've never used `git` on your current computer before.  These will configure your computer for use with your GitHub account.  If you use multiple computers (including working from an HPC/Supercomputer), these steps will need to be repeated on each computer you use.
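Before starting, it can help to check what is already on the machine.  The sketch below is one way to do that; the helper name `have_tool` is made up for this example.  It reports whether `git`, `ssh`, and `ssh-keygen` are available, then lists any public keys that already exist in the usual location.

```bash
#!/bin/sh
# have_tool: succeed (exit status 0) if the named command is on the PATH.
# (Hypothetical helper, defined here just for this check.)
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

for tool in git ssh ssh-keygen; do
    if have_tool "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: NOT found, install it before continuing"
    fi
done

# List any public keys already present in the default key directory.
ls "$HOME"/.ssh/*.pub 2>/dev/null || echo "No existing public keys in $HOME/.ssh"
```

If a key is already listed, keep that in mind when choosing a filename for a new key so the existing one is not overwritten.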

---

Configure your local machine with an SSH key.  This will allow your computer to connect to other **trusted** computers that you've previously designated as such, and this includes your GitHub account.

```bash
    cd $HOME
    ssh-keygen
```
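will give a response/prompt like the following (recent OpenSSH versions generate an Ed25519 key by default, so the key type and default filename you see may differ):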
```
    Generating public/private rsa key pair
    Enter file in which to save the key (/home/username/.ssh/id_rsa): 
```
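Press Enter to accept the default location.  If the file already exists, choose a new filename such as `git_rsa` or something you'll recognize.  Type the full path (for example, `/home/username/.ssh/git_rsa`), because a bare filename is saved in the current directory rather than in `.ssh`.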

```
    Enter passphrase (empty for no passphrase): 
    Enter same passphrase again: 
```
You can set a passphrase if you want, but keep in mind you'll be entering it every time you upload changes of your code to GitHub, unless an SSH agent is caching the unlocked key for you.  Some people choose not to have a passphrase for this particular aspect of their work, others do.

```
    Your identification has been saved in /home/username/.ssh/git_rsa
    Your public key has been saved in /home/username/.ssh/git_rsa.pub
    The key fingerprint is:
    SHA256:8U9t+r+SwCi8Xe8uu3HCjbHa7WU51A9pArzm9+F+esk username@Computer
    The key's randomart image is:
    +---[RSA 3072]----+
    |                 |
    |       .         |
    |        +  . +   |
    |         =. =. ..|
    |      . S =*o.=..|
    |       = .oBo+..o|
    |        ..o+*o.*.|
    |       . o.+o*D .|
    |           +@Bo*o|
    +----[SHA256]-----+
```
(As an aside, this is a modified variant of the one generated for this workshop, so don't bother trying to mess with my stuff.)
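If you'd like to see what `ssh-keygen` produces without touching your real keys, the sketch below generates a throwaway key in a temporary directory.  It skips the interactive prompts by passing the answers as options; the empty passphrase and the comment text `practice key` are choices made only for this example.

```bash
#!/bin/sh
# Work in a temporary directory so nothing in ~/.ssh is affected.
scratch=$(mktemp -d)

# -q  quiet (no banner or randomart)
# -t  key type            -b  key size in bits
# -N  passphrase (empty here, for practice only)
# -C  comment stored at the end of the public key
# -f  file to write the private key to (the public key gets .pub appended)
ssh-keygen -q -t rsa -b 3072 -N "" -C "practice key" -f "$scratch/git_rsa"

ls "$scratch"                              # the private key and its .pub partner
ssh-keygen -l -f "$scratch/git_rsa.pub"    # print the key's fingerprint
```

When you're done looking, the practice key can be deleted with `rm -r "$scratch"`.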

---

Add the SSH key to your GitHub account to allow your computer to access it.

Go to your account settings page and click on `SSH and GPG keys`, then click on `New SSH key`.

![GitHub Step 1](Images/GH_01.png)

![GitHub Step 2](Images/GH_02.png)

![GitHub Step 3](Images/GH_03.png)


You'll need some text out of a file generated by the `ssh-keygen` step.  If you named the keyfile `git_rsa`, the process will have also produced a file called `git_rsa.pub`, which is the "public key" corresponding to your computer's private key.  In simpler terms, the public key is like a "secret question" that the other computer can ask, that only your computer with its private "secret answer" can properly respond to, so both computers know the other is trusted with this information transfer.  

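Open the `git_rsa.pub` file in a text editor (or print it in the terminal with `cat ~/.ssh/git_rsa.pub`) and copy all the text into the field shown on the GitHub website here.  Only ever share the `.pub` file; the private key (`git_rsa`, with no extension) should never leave your computer.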

![GitHub Step 4](Images/GH_04.png)
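One extra step may be needed if you chose a non-default filename.  `ssh` only tries a handful of standard key names (such as `id_rsa`) on its own, so a key called `git_rsa` has to be pointed out explicitly.  The sketch below is one way to do that, assuming the key was saved as `~/.ssh/git_rsa`: it appends a GitHub entry to your `ssh` configuration file.

```bash
#!/bin/sh
config="$HOME/.ssh/config"
mkdir -p "$HOME/.ssh"

# Append an entry telling ssh which key to offer to github.com.
# Run this once only; running it again adds a duplicate entry.
{
    echo ""
    echo "Host github.com"
    echo "    User git"
    echo "    IdentityFile ~/.ssh/git_rsa"
    echo "    IdentitiesOnly yes"
} >> "$config"

# Keep the configuration file private to your user account.
chmod 600 "$config"

# To test the connection afterwards (needs network access):
#   ssh -T git@github.com
```

If everything is set up, the test command should greet you by your GitHub username and mention that GitHub does not provide shell access, which is expected.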

---

You'll also need to configure `git` on your computer.  Assuming you have `git` already installed, you can begin with setting some of the initial variables.

You can configure individual repositories (projects) with these settings, or you can configure `git` globally to set your defaults.  For now, we'll assume that you only have one GitHub account to manage on your computer.

```bash
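# Name recorded as the author of your commits
git config --global user.name "Firstname Lastname"
# Email recorded with your commits; using the address linked to your
# GitHub account lets GitHub attribute the commits to you
git config --global user.email "username@emailserver.com"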
```
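For comparison, here is a sketch of the per-repository alternative, using a throwaway repository in a temporary directory.  The file name `notes.txt` and the commit message are invented for the example.  Leaving off `--global` stores the settings in that one repository only, and the final command shows how the name and email end up attached to each change.

```bash
#!/bin/sh
# Create a throwaway repository to experiment in.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Without --global, these settings apply to this repository only.
git config user.name "Firstname Lastname"
git config user.email "username@emailserver.com"

# Make one change and record it.
echo "Hello, version control" > notes.txt
git add notes.txt
git commit -q -m "Add first note"

# Show who made which change: author name, email, and commit message.
git log --format='%an (%ae): %s'
```

Running `git config --list` inside a repository shows every setting currently in effect, which is a quick way to confirm what name and email `git` will use there.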