# Getting Started in Python

## Python and programming languages

Programming languages are a way to issue explicit instructions to a computer to perform operations—here, the operations will all be about doing economics!

Python usually ranks as the first or second most popular programming language in the world and is also considered to be one of the easiest to learn. It's a general-purpose programming language, which means it can perform a wide range of tasks. This combination of being relatively easy to learn but extendable to many applications is why people say Python has a "low floor and a high ceiling". As a language, it is widely used across industry, academia, and the public sector, and is frequently taught in schools. It has been applied to create the [first image of a black hole](https://numpy.org/case-studies/blackhole-image/), perform [economic analysis](https://aeturrell.github.io/coding-for-economists), and it is [behind the large language models](https://github.com/openai/gpt-2) (like ChatGPT) that are revolutionising how we think about AI.

Programming languages come in versions, which can be quite different. At the time of writing, Python 3.11 was the most recent version released. But the "base" language isn't the only thing you'll need: some of the most important functionality of programming languages is provided by add-ons called packages or libraries, which themselves have versions.

The combination of the language and its version (e.g. Python 3.9), the packages and their versions (e.g. numpy 1.24), and the operating system the code is being run on (e.g. macOS Catalina) is called the computational environment.
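As a rough illustration, the three ingredients of a computational environment can be inspected from within Python using only the standard library. The function name `describe_environment` and the packages it looks up are assumptions made for this sketch; they are not something the book prescribes.

```python
import platform
from importlib import metadata


def describe_environment(packages=("numpy", "pandas")):
    """Collect the language version, operating system, and package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment
            versions[name] = None
    return {
        "python": platform.python_version(),
        "operating_system": f"{platform.system()} {platform.release()}",
        "packages": versions,
    }


# Print one line per ingredient of the environment
for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

A package that is not installed is reported as `None` rather than raising an error, so the same snippet can be run on any machine to compare environments.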

## Preliminaries for Python

To programme, you will need two things on your computer:

- an installation (or "distribution") of the programming language. (We recommend you install the "conda distribution" of Python; more on that in a moment.)

- an integrated development environment (IDE), which is a way to write and run your code. (We recommend Visual Studio Code; more on that soon.)

If you don't want to install anything directly on your computer, there are a couple of other ways you can code making use of free online services:

- [Google Colab](https://colab.research.google.com/), which provides Python notebooks online for free. (Notebooks mix text and code.)
- [GitHub Codespaces](https://github.com/features/codespaces), which offer an online version of Visual Studio Code on a remote machine that has Python already installed. Hours are billed but there's a generous free tier. Codespaces supports notebooks or scripts.

You may need to create an account if you use either of these services.

## Using Python on your computer

The instructions in the videos will tell you how to:

- install the [Anaconda, or "conda", distribution of Python](https://www.anaconda.com/products/distribution) on your computer; and
- install [Visual Studio Code](https://code.visualstudio.com/) as your development and analysis environment, and use it.

These are the two things you need to get going with Python!

*[How to install Python using the Anaconda distribution of Python](https://www.youtube.com/watch?v=ZWQwGR5ppnk)*

<iframe width="560" height="315" src="https://www.youtube.com/embed/ZWQwGR5ppnk" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

*[How to install Visual Studio Code and use it to run Python code](https://www.youtube.com/watch?v=1kKTYsQdaPw)*

<iframe width="560" height="315" src="https://www.youtube.com/embed/1kKTYsQdaPw" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Once you have Visual Studio Code installed and opened, navigate to the 'extensions' tab in the vertical bar of icons on the left-hand side (it's the one that looks like four squares). You'll need to install the [Python extension for VS Code](https://marketplace.visualstudio.com/items?itemName=ms-python.python), which you can search for by using the text box within VS Code's extensions panel.

## Writing Your First Code

There are broadly two ways to code: via "Jupyter" notebooks, which mix text and code in different blocks, or via scripts, which are predominantly code. Notebooks tend to be used more for ad hoc analysis while scripts are usually used for building production code—but you can use whichever you prefer for the task at hand. The Python chapters of this book are all available to download as Jupyter Notebooks.

### 'Hello World' in a script

Now you will create and run your first code. If you get stuck, there's a more in-depth tutorial over at the [VS Code documentation](https://code.visualstudio.com/docs/python/python-tutorial).

Create a new folder for your work, open that folder with Visual Studio Code and create a new file, naming it `hello_world.py`. The file extension, `.py`, is very important as it implicitly tells Visual Studio Code that this is a Python script. In the Visual Studio Code editor, add a single line to the file:

```python
print('Hello World!')
```

Save the file.

If you named this file with the extension `.py` then VS Code will recognise that it is Python code and you should see the name and version of Python pop up in the blue bar at the bottom of your VS Code window. Make sure that the version of Python displayed here is the Anaconda version that you just installed. If you have a fresh install of Anaconda's distribution of Python, you'll probably see something like `Python 3.8 64-bit ('base': conda)`.

When you press save, you may get messages about installing extra packages or making Pylance your default language server; just go with VS Code's suggestions here, except the one about the terminal and conda, which you can say no to.

Alright, let's run some code. Select/highlight the `print('Hello World!')` text you typed in the file and right-click to bring up some options including 'Run Selection/Line in Terminal' and 'Run Selection/Line in Interactive Window'. Because VS Code is a richly featured IDE, there are lots of options for how to run the file.

The interactive window is a convenient and flexible way to run code that you have open in a script or that you type directly into the interactive window code box. The interactive window will 'remember' any variables that have been assigned (for example, by code statements like `x = 5`), whether they came from running some lines in your script or from you typing them in directly.

To run the code in an interactive window, **right-click and select 'Run Selection/Line in Interactive Window'**. This should cause a new 'interactive' panel to appear within Visual Studio Code, and only the selected line will execute within it. At this point, you may see a message about Visual Studio Code's default behaviour when you press <kbd>Shift</kbd> + <kbd>Enter</kbd>; for this book, it's good to have <kbd>Shift</kbd> + <kbd>Enter</kbd> default to running a line in the interactive window. The box below has instructions for how to ensure this always happens.

Let's make more use of the interactive window. At the bottom of it, there is a box that says 'Type code here and press shift-enter to run'. Go ahead and type `print('Hello World!')` directly in there to achieve the same effect as running the line from your script. Also, any variables you define in the interactive window (from your script or directly by entering them in the box) will persist.

To see how variables persist, type `hello_string = 'Hello World!'` into the interactive window's code entry box and press <kbd>Shift</kbd> + <kbd>Enter</kbd>. If you now type `hello_string` and press <kbd>Shift</kbd> + <kbd>Enter</kbd> again, you will see the contents of the variable you just created. You can also click the grid symbol at the top of the interactive window (between the stop symbol and the save file symbol); this is the variable explorer and will pop open a panel showing all of the variables you've created in this interactive session. You should see one called `hello_string` of type `str` with a value `Hello World!`.

This shows the two ways of working with the interactive window: running segments of a script, or writing code directly in the entry box.
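A minimal sketch of what persistence means in practice: each statement below could be run on its own with <kbd>Shift</kbd> + <kbd>Enter</kbd>, and later statements can still use names assigned by earlier ones. The variable name `shouted` is invented for this illustration.

```python
# Run this line first: it assigns a variable in the interactive session
hello_string = 'Hello World!'

# Run this later: the session still remembers hello_string
print(hello_string)

# The variable explorer would report this type as str
print(type(hello_string).__name__)

# New variables can build on ones that were assigned earlier
shouted = hello_string.upper()
print(shouted)
```

After running these lines, the variable explorer should list both `hello_string` and `shouted`.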

### Your first notebook

The other way of writing code is in notebooks, usually Jupyter Notebooks. Notebooks mix code and text by having a series of "cells" that are either code or text. We'll introduce the basics here, but if you get stuck with this tutorial, there's a more in-depth Visual Studio Code and Jupyter tutorial available [here](https://code.visualstudio.com/docs/python/jupyter-support).

To get started with Jupyter Notebooks, you'll need to have a Python installation and to run `pip install jupyterlab` on your computer's command line (aka the terminal—more on this soon!).

Create a new file in Visual Studio Code and save it as `hello_world.ipynb`. Close the file and re-open it. The notebook interface should automatically load and you'll see options to create cells with plus signs labelled 'Code' and 'Markdown'. A cell is an independent chunk of either code or text. Text cells have markdown in them, a lightweight language for creating text outputs.

Try adding `print("hello world!")` to the first (code) cell and hitting the play symbol on the left-hand side of the cell. You will be prompted to select a "kernel", a version of Python on your system. It doesn't matter which you use.

Now add a markdown cell ("+ Markdown") and enter:

```markdown
# This is a title

## This is a subtitle

This notebook demonstrates printing 'hello world!' to screen.
```

Click the tick that appears at the top of this cell.

Now, for the next cell, choose code and write:

```python
print('another code cell')
```

To run the notebook, you can choose to run all cells (usually a double play button at the top of the notebook page) or just one cell at a time (a play button beside a cell). 'Running' a markdown cell will render the markdown in display mode; running a code cell will execute it and insert the output below. When you play the first code cell, you should see the 'hello world!' message appear.

Jupyter Notebooks are versatile and popular for early exploration of ideas, especially in fields like data science. Jupyter Notebooks can easily be run in the cloud using a browser too (via Binder or Google Colab) without any prior installation. Although it doesn't contain much code, the page you're reading now can be loaded into Google Colab as a Jupyter Notebook by clicking 'Colab' under the rocket icon at the top of the page.

You can try a Jupyter Notebook online, without installing anything, at [https://jupyter.org/try](https://jupyter.org/try). Click on Try Classic Notebook for a tutorial.

## Packages and How to Install Them

Packages (also called libraries) are key to extending the functionality of Python. The default installation of Anaconda comes with many (around 250) of the packages you'll need, but it won't be long before you'll need to install some extra ones. There are packages for geoscience, for building websites, for analysing genetic data, and, yes, of course, for economics. Packages are typically not written by the core maintainers of the Python language but by enthusiasts, firms, researchers, academics, all sorts! Because anyone can write packages, they vary widely in their quality and usefulness. There are some that are key for an economics workflow, though, and you'll be seeing them again and again.

Three Python packages are ubiquitous: **numpy**, **pandas**, and **matplotlib**. These provide numerical, data analysis, and plotting functionality, respectively.

Python packages don't come built-in (by definition) so you need to install them (just once, like installing any other application), and then import them into your scripts (whenever you use them in a script). When you issue an install command for a specific package, it is automatically downloaded from the internet and installed in the appropriate place on your computer.
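To make the install-once, import-when-needed distinction concrete, here is a small sketch that checks whether some packages are already installed before you try to import them. It uses only the standard library; the list of package names in `wanted` is just an example.

```python
from importlib.util import find_spec

# Packages to look for; this list is only an example
wanted = ["numpy", "pandas", "matplotlib"]

# find_spec returns None when a top-level package cannot be found
status = {name: find_spec(name) is not None for name in wanted}

for name, installed in status.items():
    if installed:
        print(f"{name}: installed, ready to import")
    else:
        print(f"{name}: not found, install it before importing")
```

Any package reported as not found needs to be installed before an `import` statement for it will work.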

To install extra Python packages, you issue install commands to a text-based window called the "terminal".


### The Terminal in Brief: a prelude to installing packages

The *terminal* is also known as the *command line* and sometimes the *command prompt*. It was labelled 4 in the screenshot of Visual Studio Code from earlier in the chapter. The terminal is a text-based way to issue all kinds of commands to your computer (not just Python commands) and knowing a little bit about it is really useful for coding (and more) because managing packages, environments (which we haven't yet discussed), and version control (ditto) can all be done via the terminal. We'll come to these in due course, but for now, a little background on what the terminal is and what it does.

Firstly, everything you can do by clicking on icons to launch programmes on your computer, you can also do via the terminal. For many programmes, a lot of their functionality can be accessed using the command line, and other programmes *only* have a command line interface (CLI), including some that are used for data science.

To open up the command line within Visual Studio Code, use the <kbd>⌃</kbd> + <kbd>\`</kbd> keyboard shortcut (Mac) or <kbd>ctrl</kbd> + <kbd>\`</kbd> (Windows/Linux), or click "View > Terminal".

Windows users may find it easiest to use the Anaconda Prompt as their terminal, at least for installing Python packages.

If you want to open up the command line independently of Visual Studio Code, search for "Terminal" on Mac and Linux, and "Anaconda Prompt" on Windows.

If you have installed the Anaconda distribution of Python, your terminal should look something like this as your 'command prompt':

```bash
(base) your-username@your-computer current-directory %
```

on Mac, and the same but with '%' replaced by '$' on Linux, and (using the Anaconda Prompt)

```bash
(base) C:\Users\YourUsername>
```

on Windows. If you don't see the word `(base)` at the start of the line, you may need to type `conda activate` first.

The `(base)` part is saying that your current Python environment is the base one (later, we'll see how to add others for reproducibility and to isolate projects). Unfortunately, and confusingly, the commands that you can use in the terminal differ between Mac and Linux, on the one hand, and Windows, on the other, but many of the principles are the same.
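If you are unsure which environment your code is actually running in, you can also ask Python directly. The sketch below assumes conda's usual behaviour of setting the `CONDA_DEFAULT_ENV` environment variable when an environment is activated; if conda is not active in the session, the variable will simply be missing.

```python
import os
import sys

# Folder of the Python installation (or environment) running this code
print("Python prefix:", sys.prefix)

# Full path to the Python executable in use
print("Executable:", sys.executable)

# conda sets CONDA_DEFAULT_ENV when an environment is activated;
# it will be absent if conda is not active in this session
env_name = os.environ.get("CONDA_DEFAULT_ENV", "(no conda environment active)")
print("Conda environment:", env_name)
```

With a fresh Anaconda install and the base environment active, the last line would be expected to report `base`.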

### Installing Packages in the Terminal

To install extra Python packages in the correct Python version, you'll need to have conda "activated": if you don't see the name of an environment, e.g. `(base)`, at the start of your terminal's line, use the `conda activate` command first. On Windows, the terminal to use is usually the command prompt (available in the integrated Visual Studio Code terminal) or the Anaconda Prompt (available in the start menu).

Install packages on the command line by typing

```bash
conda install package-name
```

and hitting return, where `package-name` might be `pandas`. This will try to install a version of the package that is already optimised for your type of computer, and will automatically come with any dependencies (packages the package you're installing needs to run). The pre-built packages that are provided by Anaconda are convenient for a host of reasons. Anaconda provide pre-built versions of around 7,500 of the most popular packages.

However, there are over 330,000 Python packages on PyPI (the Python Package Index) so you may sometimes find one that is not covered by `conda install`. When there isn't a pre-built Anaconda version of a package available, the next thing to try is

```bash
pip install package-name
```

In true programming-humour style, pip is a recursive acronym that stands for 'Pip Installs Packages'. You can see what packages you have installed by entering `conda list` into the command line.

Here's a full example of the commands used to install the **pandas** package into the base environment (you may not need the first one):

```bash
your-username@your-computer current-directory % conda activate
(base) your-username@your-computer current-directory % conda install pandas
```

The key packages you'll need to install for this book are:

- **matplotlib**
- **pandas**
- **seaborn**
- **pingouin**
- **skimpy**

## Python Version Used in Examples

The Python walkthroughs in this book were created with Python and Operating System versions:

```python
%load_ext watermark
%watermark
```