# Getting Started with Python, Jupyter and Git


## 0. Learn the Very Basics
[Codecademy's Python course](https://www.codecademy.com/catalog/language/python) is an excellent way to start getting to grips with Python.


## 1. Install All the Things

### Python / Jupyter:
Download and install the Python 3.6 version of [Anaconda](https://www.anaconda.com/download/).

Anaconda is a distribution of Python that comes pre-installed with most of the packages you'll need for data science. Every package in Anaconda is curated by the folks at Anaconda, Inc. (formerly Continuum Analytics), to make sure they all work well together. This means that Anaconda often doesn't have the very latest versions of packages... but that rarely matters.

#### Why Python 3?
Short answer: Python 2 will stop being developed in ~2 years, so if you're starting fresh there's no point using it.

For most practical purposes, Python 2 and Python 3 are identical apart from a few minor syntax tweaks. The only reason you might want to run Python 2 is if there's a particular package you want to use that hasn't migrated to Python 3 yet. I've never encountered this as a problem, and everything is now migrating to Python 3... so I'd be surprised if you ever need to do this.
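
To give a flavour of those 'minor syntax tweaks', here is a minimal sketch of two of the best-known differences, written for Python 3. The variable names are made up for illustration.

```python
# Python 3: print is a function, so the parentheses are required.
# (Python 2 also allowed the statement form: print "hello")
print("hello")

# Python 3: dividing two integers with / gives a float.
# (In Python 2, 3 / 2 gave 1, because / on integers rounded down.)
true_division = 3 / 2    # 1.5

# Floor division (rounding down) is written // in both versions.
floor_division = 3 // 2  # 1

print(true_division, floor_division)
```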

### Git
Download and install [Git](https://git-scm.com/).

Optionally (recommended if you're new to this), download and install a [Git Graphical User Interface](https://git-scm.com/downloads/guis/) - a friendly point-and-click intro to the world of version control! I've had good experiences with both [GitKraken](https://www.gitkraken.com/) and [SourceTree](https://www.sourcetreeapp.com/).


## 2. Running Jupyter
Now that Python is installed, you can start using it! My favourite Python interface is [Jupyter](http://jupyter.org) - a browser-based interface for interactively writing and running code, taking notes and making plots. To start Jupyter, you can either use the Anaconda Navigator, or use the terminal (Anaconda Prompt, in Windows).

Open a Terminal (Anaconda Prompt, in Windows), navigate to the directory you want to work in (using `cd`), and run:

    jupyter notebook

This should magically open a browser window with Jupyter running. Create new 'notebooks' using the menu in the top right.
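
Once a notebook is open, type code into a cell and press Shift+Enter to run it. As a first cell to try, something as small as this will do (the variable name is arbitrary):

```python
# Do some arithmetic, store the result in a variable, and print it.
seconds_per_day = 24 * 60 * 60
print("Seconds in a day:", seconds_per_day)
```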

Before you go ahead and Python all the things, there are a few more things to think about...


## 3. Managing Python
Python is a relatively basic programming language. It can't do that much 'out of the box' (unlike, e.g. Matlab, which comes 'batteries included'). This is because Python can be used for a vast array of applications, from controlling robots, to managing mailing lists, making websites, data science, and much more. If it came with all this 'in the box', it would be unfeasibly massive.

To do useful work in Python, you rely on 'packages'. These are extensions to Python (sets of functions and classes) which allow you to do what you actually want to do. To use packages in Python, you have to import them, using (e.g.):

    import numpy as np

This will import the **Nu**merical **Py**thon package, and make all of its functions available under the alias `np` (type `np.[Tab]` to see a list of functions).
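
The same `import ... as ...` pattern works for any package. Here's a minimal sketch using the standard-library `math` module, chosen only because it needs no extra installation. The alias `m` and the variable names are arbitrary. It shows how the alias gives access to a package's functions, and how `dir` gives a list of names similar to what Tab completion shows in Jupyter.

```python
import math as m

# Functions and constants are reached through the alias, using a dot.
root = m.sqrt(16)            # square root of 16
circle_area = m.pi * 2 ** 2  # area of a circle with radius 2

# dir() lists the names a package provides; skip internal ones starting with "_".
public_names = [name for name in dir(m) if not name.startswith("_")]

print(root, circle_area)
print(public_names[:5])
```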

A nice thing about packages is that if you can think of a task, someone's probably written a package to do it. The irritating thing about this is that often *several* people have written packages to do it, and it can be difficult to work out which is best. Two useful tools for this are the [Python Package Index](https://pypi.python.org/pypi) and [GitHub](https://github.com/). You can look up packages here, and find out how 'active' the developers are - i.e. is there a big team of people working on developing and maintaining the package, or has it been abandoned for the last 3 years? If the latter, it's unlikely to be a good choice.

If you've installed Anaconda (above), it comes bundled with [most of the common packages for data analysis](https://docs.anaconda.com/anaconda/packages/py3.6_linux-64). But what if you want to install a package that isn't included?

Before we get to this, there's an important caveat...

### Virtual Environments
You don't *need* to use virtual environments to use Python, but they're really useful, and you should.

Virtual environments allow you to have multiple versions of Python and packages installed on your computer simultaneously. For example, you could have both Python 2 and Python 3 installed side by side in separate virtual environments. A virtual environment is basically just a folder containing a complete Python installation. When running Python, you can choose which virtual environment you want to run.

Anaconda comes with a few handy tools for managing virtual environments. You can do this via the Anaconda Navigator interface, or via a terminal. A few handy terminal (Anaconda Prompt, in Windows) commands:

Make a new environment:

    conda create -n yourenvironmentname python=x.x anaconda

Replace `yourenvironmentname` with what you want to call it, and `x.x` with the Python version you want. `anaconda` at the end tells it to install all the 'basic' Anaconda packages in your new environment. You can leave this off if you want *only* Python with no extra packages installed.

'Activate' an environment:

    source activate yourenvironmentname

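This tells your computer that you want to use a particular environment. Once you've run this command in your terminal, everything you do in Python will use the Python installation within `yourenvironmentname`. If this has worked, you should see `(yourenvironmentname)` to the left of each line in your terminal. In the Anaconda Prompt on Windows, drop the `source` and run `activate yourenvironmentname`. With conda 4.4 or later, `conda activate yourenvironmentname` works on every platform.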
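
If you're ever unsure which environment Python is actually running from (inside Jupyter, for example), you can ask Python itself. A minimal sketch using the standard-library `sys` module; the variable names are just for illustration:

```python
import sys

# Path to the Python program that is running this code.
# Inside an activated environment it should sit within that environment's folder.
python_path = sys.executable

# Root folder of the running Python installation
# (the environment's folder, if an environment is active).
environment_folder = sys.prefix

# Version of Python in use, as "major.minor".
python_version = "{}.{}".format(sys.version_info.major, sys.version_info.minor)

print(python_path)
print(environment_folder)
print(python_version)
```

If the printed paths don't contain the name of the environment you expected, the environment probably wasn't activated before Python was started.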

'Deactivate' an environment:

    source deactivate

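This tells the computer you've finished with that environment. Alternatively, you can just close the terminal window. In the Anaconda Prompt on Windows, run `deactivate` without `source`. With conda 4.4 or later, `conda deactivate` works on every platform.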

Remove an environment:

    conda remove -n yourenvironmentname --all

This deletes the virtual environment and all its contents.


### Installing Packages
So you've got Anaconda installed, are happily working in your virtual environment, and need to use a package that isn't included in Anaconda...

For example, the [`obspy`](https://github.com/obspy/obspy/wiki) seismology package. Before you install anything, make sure you've activated (`source activate environmentname`) the virtual environment you want to install the package in.

#### First stop `conda`
The first thing to try is Anaconda's built-in package manager `conda`. This looks at Anaconda's curated list of packages that it *knows* work together, and installs it from there. To do this, run:

    conda install obspy

in a Terminal (Anaconda Prompt, in Windows).

In this case, `obspy` is not available in the Anaconda list of packages, and you'll get an error. The next thing to try is:

#### If that fails, `pip`
`pip` is the Python Package Index installer. It looks for packages on the community-maintained Python Package Index. These packages haven't been checked by the folks at Anaconda, so they *may* not work properly on your system, although in practice this is almost never a problem with actively maintained packages. To use `pip`, run:

    pip install obspy

This will download and install the package, and you can immediately import it into your Python session.
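
To check from within Python whether a package is available in the current environment, without importing it, one option is the standard library's `importlib.util.find_spec`. A minimal sketch; `is_installed` is a hypothetical helper name, not part of any package:

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can find a top-level package called package_name.

    Hypothetical helper for illustration; it only looks the package up,
    it does not import it.
    """
    return importlib.util.find_spec(package_name) is not None


# True only if obspy is installed in the environment running this code.
print(is_installed("obspy"))

# The standard-library math module should always be found.
print(is_installed("math"))
```

If this reports that a package is missing right after you've installed it, it's worth checking that the install was run in the same environment that Jupyter is using.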


## 4. Git

### What is it?
Version control.

**Benefits:**

- Multiple collaborators can work on a project, keep track of who's done what, and avoid conflicting changes.
- No more 'Save As: my_amazing_code_2017-10-12-v102993.py' - the versioning system keeps track of all changes made, and changes can be 'rolled back' later without losing them.
- Can 'clone' or 'branch' other people's code, and run/modify it yourself.

### Basics
Read [this](https://git-scm.com/book/en/v2/Getting-Started-Git-Basics).

If you want to commit changes to online repositories, you'll need a [GitHub Account](https://github.com/join).

### Making a new Local Repository

Git doesn't *have* to interact with the internet. You can set up a version-control repository locally (on your computer), and do everything offline. To do this:

#### In a Git Client
Find the option to create or 'init' a new repository. Tell it which folder you want to use, make sure you've selected/deselected the option that makes it 'local only', and press 'go!'.

#### In a Terminal (/Anaconda Prompt)
Open a new terminal, `cd` to the folder you want to set up the repository in, and run:

    git init
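
`git init` creates a hidden `.git` folder inside the folder you ran it in, which is where Git keeps all of its records. As a rough sketch, you can check for that folder from Python to confirm the repository was set up. `is_git_repository` is a hypothetical helper name, and it only looks in the folder you give it, not in any parent folders:

```python
from pathlib import Path


def is_git_repository(folder):
    """Return True if folder directly contains a '.git' directory.

    Hypothetical helper for illustration only.
    """
    return (Path(folder) / ".git").is_dir()


# Check the current working directory.
print(is_git_repository("."))
```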

### Getting Data Surgery Material

#### In a Git Client
In your favourite git client, find the option to 'Clone Repository'. Specify a folder to save everything to, and in the 'Repository URL' field, put `https://github.com/rses-datascience/DataSurgeries`.

#### In a Terminal (/Anaconda Prompt)
Open a new terminal, `cd` to the place you want to save the repository and run:

    git clone https://github.com/rses-datascience/DataSurgeries