# An environment for everyone: Codespaces devcontainers


## 🚀🚀🚀 
## TLDR;

You can run a codespace without a devcontainer, but a devcontainer sets up VSCode in Codespaces with everything you need, like an OS, Python, and your requirements. The codespace user doesn't have to install anything, or run `docker build` and `docker run`. They just open the codespace and are ready to code.

Minimal requirements to run a devcontainer are just `/.devcontainer`, a dockerfile and a `devcontainer.json`.

But here are all the components we'll cover: 

### A `/.devcontainer` 🫙 with:

1) a **dockerfile 🐳**: instructions for the OS, Python, and `requirements.txt` to use in the Codespace
2) a **`devcontainer.json`** that says to use the Dockerfile, and sets some other stuff up
3) Optionally, additional scripts as `RUN` in Dockerfile or as **lifecycle scripts** using `Command` family of properties

### As needed, secret/key access 🔑 
Configured on github.com Settings.

### Maintenance tools 
Like a playbook and unit tests for codespace-related files

### Optionally: 
- 🧪 **Tests** that codespace was built correctly 
- ⚙️ **VSCode settings** to, e.g., include certain extensions
- 🏗 **Prebuild**: makes building codespace faster



<br>

<br>

< end TLDR; >

## 🚀🚀🚀

<br>
<br>


# `/.devcontainer` 🫙

Examples: [base](https://github.com/microsoft/vscode-dev-containers/blob/v0.222.0/containers/python-3/.devcontainer/base.Dockerfile), [security-advisory-filtering](https://github.com/github/security-advisory-filtering/commit/3938d544b471ee1f81e7e663ee14fd8714c90ac7), [airflow](https://github.com/github/airflow-sources/tree/master/.devcontainer), [actions-aml](https://github.com/github/actions-aml/blob/main/.devcontainer/Dockerfile); [teaching template](https://github.com/education/codespaces-teaching-template-py/blob/main/.devcontainer/Dockerfile)



## `/.devcontainer` 🫙 component 1: dockerfile 🐳


Docker isolates not just the Python `site-packages`, but also the OS and the version of Python. 

[harnesslib one](https://github.com/github/harnesslib/blob/main/Dockerfile.harnesslib)


### Dockerfiles `build` into `image`s (templates), which `run` as `container`s 

First, you write instructions for what you want to install: usually an OS (for example, a Linux distribution, even when your own machine runs macOS or Windows), dependencies, and your project. This is called a _dockerfile_. 🐳 📁 

Then you turn those instructions into a set of binary executable files: using `docker build` to _build_ the _dockerfile_ into an _image_ 🖼️.
- 💬 "image" is a metaphor for the idea of these executables being like a snapshot. Let's think of it more like an _environment template_.

When you `docker run` an _image_ it creates a _container_ -- so a _container_  🫙 is a running _image_.
- Why call it "container" instead of "environment"? Because environments are more complex; e.g., it also includes your hardware and networking. Even the containerized OS is a simplified one. So consider a container a type of environment.
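
To make the build-then-run sequence concrete, here is a minimal Python sketch that only assembles the two Docker CLI commands (it does not execute them). The image tag `learn-py-dev` and the Dockerfile path are placeholder assumptions, not names taken from any real project.

```python
import shlex


def docker_build_cmd(tag, context=".", dockerfile=None):
    """Command that builds a dockerfile into an image named `tag`."""
    cmd = ["docker", "build", "-t", tag]
    if dockerfile:
        cmd += ["-f", dockerfile]  # -f points at a non-default Dockerfile path
    cmd.append(context)            # the build context directory comes last
    return cmd


def docker_run_cmd(tag, interactive=True):
    """Command that starts a container from the image named `tag`."""
    cmd = ["docker", "run", "--rm"]  # --rm removes the container on exit
    if interactive:
        cmd.append("-it")            # attach an interactive terminal
    cmd.append(tag)
    return cmd


if __name__ == "__main__":
    # Placeholder tag and path, for illustration only
    print(shlex.join(docker_build_cmd("learn-py-dev", dockerfile=".devcontainer/Dockerfile")))
    print(shlex.join(docker_run_cmd("learn-py-dev")))
```

Passing either list to `subprocess.run` would execute it on a machine with Docker installed.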

Generating and running containers is all done by the Docker _engine_/"daemon"
- 💬 "daemon" is from mythology of a guardian entity. Let's prefer "engine".


<img width="621" alt="image" src="https://user-images.githubusercontent.com/38010821/227788568-c28acc56-6c71-4cad-93b0-a0b14021546f.png">

### Codespaces devcontainers automatically install & run the Docker engine, do `docker build` (dockerfile --> image) and `docker run` (image --> container)! 

### 📄 So all you need is to write a dockerfile, like [this annotated example.](https://github.com/lizre/learn-py/blob/master/.devcontainer/Dockerfile) 


### Set a non-root user in your dockerfile or devcontainer

Every process on a computer is associated with a user account and its permissions to do actions and access resources.

The _root user_ has full permissions.
 
When a process is started, it has the permissions of the account that started it. 

☢️ So if you run a container as the root user, everything in the container runs as root and can do whatever it wants. Like access your personal documents.

Docker containers default to running as the _root user_.

But Codespaces defaults to a _non-root_ user.

👍 Still, it's considered best to **explicitly `useradd` a non-root user to your Dockerfile.** (but I can't figure out why. 🤔)

Alternatively, some examples add `"remoteUser": "vscode"` to the `devcontainer.json`. This is the user that VS Code, its terminals, and the lifecycle scripts run as inside the container. If not provided, the user the container itself runs as (e.g., the `USER` set in the Dockerfile) is used.
Most examples seem to do either `useradd` in the Dockerfile or `remoteUser` here, not both. Neither seems to have a clear advantage.
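
One quick way to see which user you actually ended up with is to check from inside the codespace. This is a minimal sketch, assuming a POSIX container (where user ID 0 is root); the function name is made up for illustration.

```python
import os


def running_as_root():
    """True if the current process has root privileges (POSIX only)."""
    return os.geteuid() == 0  # effective user ID 0 is the root user


if __name__ == "__main__":
    print(f"uid={os.geteuid()} root={running_as_root()}")
```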



# TODO: venv in docker file

- https://www.youtube.com/watch?v=qLvAHhJAVlI&list=PLmsFUfdnGr3wTl-NCblzcrEv2lFSX975-&index=15

examples that use venv in the docker file
- https://github.com/education/codespaces-teaching-template-py/blob/main/.devcontainer/Dockerfile
- https://github.com/microsoft/vscode-dev-containers/blob/v0.222.0/containers/python-3/.devcontainer/base.Dockerfile
    





## `/.devcontainer` 🫙 component 2: devcontainer.json

`devcontainer.json` configures a codespace. It tells VS Code how to build and run the Docker container, using the Dockerfile as a template.

All Codespaces have a configuration. If you create one without a `devcontainer.json` file, Codespaces uses a [default configuration](https://docs.github.com/en/codespaces/setting-up-your-project-for-codespaces/adding-a-dev-container-configuration/introduction-to-dev-containers#using-the-default-dev-container-configuration).

You can define multiple configurations.

### Write a `devcontainer.json` like [this annotated example](https://github.com/lizre/learn-py/blob/master/.devcontainer/devcontainer.json)

You need to refer to the dockerfile in the `devcontainer.json`, e.g., `"dockerfile": "Dockerfile"` inside the `build` property. The path is relative to the `devcontainer.json` file, which means you can keep the Dockerfile somewhere else. For example, if you want the Dockerfile to be in the root directory of the project instead of in `/.devcontainer`, you can do `"dockerfile": "../Dockerfile"`.
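
As a rough illustration of how small the file can be, this sketch generates a minimal `devcontainer.json` body from Python. It assumes the `build.dockerfile` form of the property; the `name` value and the helper name are placeholders, and the annotated example linked above may contain more settings.

```python
import json


def minimal_devcontainer(dockerfile="Dockerfile", name="my-devcontainer"):
    """Return a minimal devcontainer.json body as a dict.

    `dockerfile` is resolved relative to the devcontainer.json file,
    so "../Dockerfile" points at the repository root.
    """
    return {
        "name": name,                          # label for the configuration
        "build": {"dockerfile": dockerfile},   # which Dockerfile to build
    }


if __name__ == "__main__":
    print(json.dumps(minimal_devcontainer("../Dockerfile"), indent=4))
```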

# With just the `Dockerfile` and `devcontainer.json`, you can run your devcontainer!

Put them in .devcontainer like [this](https://github.com/lizre/learn-py/tree/master/.devcontainer):

![image](https://user-images.githubusercontent.com/38010821/228025282-e68bf0fe-feb1-4a6b-ad3b-475dc1713a63.png)


Open a codespace. It'll say you're on a "custom image":

![image](https://user-images.githubusercontent.com/38010821/228026931-f331aa76-134b-423c-a236-6e5d9ff5100b.png)

And `pip list` will have the stuff from your `requirements.txt`:

![image](https://user-images.githubusercontent.com/38010821/228028817-92e3123c-0a97-4940-add1-e2d72e5e7d08.png)

Plus some other stuff that probably comes with our base image.

### devcontainer `Command`s as alternatives to Dockerfile `RUN`

Sometimes you need to run commands as part of configuring your devcontainer.

You can put that in your `devcontainer.json` as a `*Command`, e.g.:

`"postCreateCommand": "pip3 install --user -r requirements.txt",`

Remember that our Dockerfile already had

```
COPY requirements.txt /tmp/pip-tmp/
RUN pip3 --disable-pip-version-check --no-cache-dir install -r /tmp/pip-tmp/requirements.txt \
    && rm -rf /tmp/pip-tmp
```

So this illustrates that some things, you can choose to do either as `RUN` in the Dockerfile, OR as a `postCreateCommand`.

There is a whole family of `*Command` scripts, known as [lifecycle scripts](https://containers.dev/implementors/json_reference/#_lifecycle-scripts).
- e.g., `onCreateCommand` executes inside the container after it is first created
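
To show how the family fits together, here is a small sketch that lists the lifecycle properties in the order the dev container reference describes them running, and picks out the ones a given configuration sets. The helper name and the sample configuration are made up for illustration.

```python
# Lifecycle properties, in the order they run
LIFECYCLE_ORDER = [
    "initializeCommand",     # runs on the host, before the container exists
    "onCreateCommand",       # first command run inside the new container
    "updateContentCommand",
    "postCreateCommand",
    "postStartCommand",      # runs every time the container starts
    "postAttachCommand",     # runs every time a tool attaches
]


def configured_commands(config):
    """Return (property, command) pairs from `config` in execution order."""
    return [(key, config[key]) for key in LIFECYCLE_ORDER if key in config]


if __name__ == "__main__":
    sample = {
        "postCreateCommand": "pip3 install --user -r requirements.txt",
        "onCreateCommand": "python3 .devcontainer/test_codespace.py",
    }
    for key, command in configured_commands(sample):
        print(f"{key}: {command}")
```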


#### If the commands are long, you can also do them as scripts (`.sh` shell and/or python scripts)

[Example 1: python script used in devcontainer onCreateCommand:](https://github.com/lizre/learn-py/blob/e7121782a94ae52a5a2eef1373818114da08d45e/.devcontainer/devcontainer.json#L40)

- So you'd make a script you want to run, like `test_codespace.py`, and then do something like 

    "onCreateCommand": ".devcontainer/test_codespace.py",

Example 2: shell script used in devcontainer onCreateCommand:
- https://github.com/github/airflow-sources/blob/master/.devcontainer/on-create-command.sh
- https://github.com/github/airflow-sources/blob/777d30ba67f325b5fa72e8ded5f04fb70578362c/.devcontainer/devcontainer.json#LL74-L75



I think you can also do these in Dockerfile `RUN`.

### ☢️ If a `Command` does not succeed, the container will not be built, and you'll get a build error.

Pros
- Runs automatically so user doesn't have to run `python .devcontainer/test_codespace.py` once they get into the codespace
- Can protect users from using a bad codespace

Cons
- if fails, may be hard for them to troubleshoot because they'll have to read build logs.

# Tests that codespace was set up correctly 🧪

## Things you might want to test

e.g., that pip is installed in the codespace.

## Ways to test them

### 1) Run each thing manually when developing devcontainer and/or when users build codespace

e.g., open the codespace and run "which pip".

### 2) Put tests in a [`.devcontainer/test_codespace.py`](https://github.com/lizre/learn-py/blob/master/.devcontainer/test_codespace.py), which user runs when they build the codespace

![image](https://user-images.githubusercontent.com/38010821/228383939-ed73b6c0-79d0-40dd-9f4d-6e3a1969ed36.png)


### 3) Automatically on every codespace creation, with on-create-command.sh? This doesn't seem common but could be cool


[example](https://github.com/github/airflow-sources/commit/fb89004ac8c3848ff3062841ccea592adf9cec57)
- Add to `devcontainer.json`: `"onCreateCommand": ".devcontainer/on-create-command.sh",`
- Make an [on-create-command.sh](https://github.com/github/airflow-sources/blob/master/.devcontainer/on-create-command.sh)

But it broke the codespace:

    2023-03-28 17:46:16.792Z: /bin/sh: 1: .devcontainer/on-create-command.sh: Permission denied

and this:

    #16 0.472 chmod: cannot access '.devcontainer/on-create-command.sh': No such file or directory

potentially related:
- https://github.com/microsoft/vscode-remote-release/issues/5432
- https://stackoverflow.com/questions/38882654/docker-entrypoint-running-bash-script-gets-permission-denied
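
One plausible cause of the `Permission denied` message (an assumption, not a confirmed diagnosis of the failure above) is that the script was committed without its executable bit. A minimal sketch for checking and fixing that from Python; the helper name is hypothetical:

```python
import os
import stat


def ensure_executable(path):
    """Add the owner-execute bit to `path` if missing; return True if changed."""
    mode = os.stat(path).st_mode
    if mode & stat.S_IXUSR:
        return False                       # already executable by its owner
    os.chmod(path, mode | stat.S_IXUSR)    # same effect as `chmod u+x path`
    return True
```

Another option is to sidestep the executable bit entirely by naming the interpreter in the command, as in `"onCreateCommand": "bash .devcontainer/on-create-command.sh"`.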

You can do `"onCreateCommand": "python3 .devcontainer/test_codespace.py"`.
- But then the container fails with error 1302 in the logs, without saying which test failed.
- What about printing the result instead of asserting? No: then it does nothing and prints nothing.

And `postCreateCommand` simply does not run it.
- I tested by making one of the tests impossible to pass, and the codespace doesn't raise any error.
- But when I run `python3 .devcontainer/test_codespace.py` in the codespace, it returns the error.
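
One way to at least learn which check failed is to run every check, report each by name, and only then decide on the exit code. This is a sketch under assumptions: the check and function names are invented, and whether the output shows up in the codespace creation log is not something these notes confirm.

```python
import shutil
import sys


def check_pip_on_path():
    """pip (or pip3) should be discoverable on PATH inside the codespace."""
    return shutil.which("pip") is not None or shutil.which("pip3") is not None


def run_checks(checks):
    """Run each (name, function) pair; return the names of the failed checks."""
    failed = []
    for name, check in checks:
        try:
            ok = bool(check())
        except Exception as exc:  # a crashing check counts as a failure
            print(f"ERROR {name}: {exc}", file=sys.stderr, flush=True)
            ok = False
        else:
            print(f"{'PASS' if ok else 'FAIL'} {name}", file=sys.stderr, flush=True)
        if not ok:
            failed.append(name)
    return failed


def main():
    """Return 1 if any check failed, else 0, for use as the process exit code."""
    failures = run_checks([("pip is on PATH", check_pip_on_path)])
    return 1 if failures else 0
```

Ending the script with `sys.exit(main())` makes a lifecycle command fail only after every check has been reported.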


### As print instead of assert


```
import os


def check_key_characteristics():
    """
    Checks that our API key object has characteristics of azure keys.
    """
    # .get() returns None instead of raising KeyError when the variable is unset
    api_key = os.environ.get('AIP_API_KEY_TEMPORARY')
    if api_key:
        print("AIP_API_KEY_TEMPORARY exists. Good!")
    else:
        print("AIP_API_KEY_TEMPORARY does not exist. You need a key.")
        return  # the remaining checks need a key to inspect

    if api_key.isalnum():
        print("AIP_API_KEY_TEMPORARY is alphanumeric. Good!")
    else:
        print("AIP_API_KEY_TEMPORARY is not alphanumeric, and it should be")

    if len(api_key) == 32:
        print("AIP_API_KEY_TEMPORARY is 32 characters long. Good!")
    else:
        print("AIP_API_KEY_TEMPORARY is not 32 characters long, and it should be")


check_key_characteristics()
```

# As needed: Authentication 🔑

Sometimes in Codespaces you need to access services or resources outside the Codespace, like Azure storage. To do that, you need to show that service or resource that you (or the Codespace user) are who you say you are. That's called _authentication_.

### authentication: verifying identity

Resources: [1](https://cloud.google.com/docs/authentication), [2](https://zapier.com/learn/apis/chapter-4-authentication-part-1/)

Different from authorization (permission to do things). 

_Principal_: an identity that can be granted access.
- users, services, apps. Your Codespace!

_Secret_: anything that you want to control access to. eg API keys, passwords.

_Credentials_: any info used to authenticate

_Password_ 
- is credential
- user-generated, stored in human memory, manually repeated with each use, and usually 1:1 with a human

_Key_
- is credential
- generated by an API, usually used programmatically (in code, and used for non-human services)
- No standard way to include it; sometimes it's added to the URL, put in the request body, or sent in an auth header instead of a username and password.
- OAuth: automates key exchange so you don't have to type it out
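
As an illustration of the "auth header" option, this sketch builds (but does not send) a request that carries a key as a bearer token. The bearer scheme is just one common convention; the URL, key, and function name are placeholders, and each service documents its own expected format.

```python
import urllib.request


def build_authenticated_request(url, api_key):
    """Build a GET request with the key in the Authorization header."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
    )


if __name__ == "__main__":
    # Placeholder URL and key, for illustration only
    req = build_authenticated_request("https://api.example.com/v1/items", "abc123")
    print(req.get_method(), req.full_url, req.get_header("Authorization"))
```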


### authenticate in a codespace by adding secrets


Go to https://github.com/lizre/learn-py/settings/secrets/codespaces, then "New repository secret"


<img width="681" alt="image" src="https://user-images.githubusercontent.com/38010821/228086774-243d5615-f982-4a7f-b4f4-bce41c804526.png">

<br>

Then it's there: 

<br>

<img width="621" alt="image" src="https://user-images.githubusercontent.com/38010821/228086807-47e63196-ceb0-4510-b01d-01871a795865.png">


Then it's available in your Codespace!:

![image](https://user-images.githubusercontent.com/38010821/228087171-7b00d845-7136-403c-ba8f-cd188a094025.png)



#### ☢️ GOTCHA: Codespaces silently converts secret names to all capital letters. 
If you're already using a secret name like `Not_so_secret` in your code, you'll need to change all instances to `NOT_SO_SECRET`!

You can access it in jupyter notebook with `os.environ.get('NOT_SO_SECRET')`.
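
A small wrapper can absorb the capitalization gotcha and fail loudly when a secret is missing. This is a sketch; `get_secret` is a made-up helper name, not part of any Codespaces API.

```python
import os


def get_secret(name):
    """Look up a Codespaces secret by name, using its upper-cased form."""
    key = name.upper()
    value = os.environ.get(key)
    if value is None:
        raise KeyError(f"Secret {key} is not set in this environment")
    return value
```

With this, `get_secret('Not_so_secret')` and `get_secret('NOT_SO_SECRET')` read the same environment variable.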

Also see this proposal for [personal secrets in codespaces](https://github.com/github/engineering/discussions/3008).

# Optional: VSCode settings ⚙️
    
VSCode settings are stored in a JSON file. On macOS, the user-level file is `~/Library/Application Support/Code/User/settings.json`:

    {
        "[python]": {
            "editor.defaultFormatter": null,
            "editor.formatOnSave": true,
        },
        "python.testing.unittestEnabled": false,
        "python.testing.pytestEnabled": true,
        "python.defaultInterpreterPath": "python",
    }

You can put similar JSON in your repo in `/.vscode/settings.json`, as in https://github.com/lizre/learn-py/tree/master/.vscode:

    {
        "editor.fontSize": 35,
        "python.testing.pytestEnabled": true,
        "python.defaultInterpreterPath": "python",
    }

See how I made the font size huge, 35!

Now put it in the repo:

            
<img width="664" alt="image" src="https://user-images.githubusercontent.com/38010821/228092487-c4825b03-2dd1-4a69-a553-5b74f1b57cc1.png">


Then when you build or rebuild the Codespace, it'll apply my huge font size:

<img width="722" alt="image" src="https://user-images.githubusercontent.com/38010821/228093857-a30a42e3-edb6-477a-94f8-7d550ead11f1.png">



# Optional: prebuilds 🏗️

https://docs.github.com/en/codespaces/prebuilding-your-codespaces/about-github-codespaces-prebuilds

https://docs.github.com/en/codespaces/prebuilding-your-codespaces/configuring-prebuilds

TLDR; makes it faster to build a new codespace.

lower priority


# Codespace maintenance and changes

### Write a maintenance playbook like [this one](https://github.com/lizre/learn-py/tree/master#codespace-playbook)



# TODO:  



### Write tests that apply to codespace-related files

eg, this found errors in my test_codespace.py:
https://github.com/github/harnesslib/blob/569f894b77bafa2c762eec2948cb44907be1983d/.github/workflows/notebook-integration-test.yaml

https://github.com/github/harnesslib/blob/569f894b77bafa2c762eec2948cb44907be1983d/.github/workflows/simple-batch-integration-test.yaml


Devcontainer updates when you create a codespace or rebuild the container. Use VS Code Command Palette (Shift+Command+P) --> `Codespaces: Rebuild Container`.

[each push to a branch that has a prebuild configuration results in a GitHub-managed GitHub Actions workflow run to update the prebuild.](https://docs.github.com/en/codespaces/prebuilding-your-codespaces/about-github-codespaces-prebuilds#about-pushing-changes-to-prebuild-enabled-branches)

Keeping keys updated
- eg https://github.com/github/harnesslib/issues/229



# You can clone submodules in devcontainers using one of two options

## 1. A `postCreateCommand` in `devcontainer.json`

`"postCreateCommand": "git submodule update --init"`

## 2. A shell script like

```
#!/bin/bash

# Checking out a particular commit of each grammar below leaves us in detached head, but
# codespace users don't need that surfaced.
git config --global advice.detachedHead false

# Set the root directory of the project (quoted so paths with spaces still work)
root=$(dirname "$(dirname "$(realpath "$0")")")

# Clone the tree-sitter-<language> repositories (the "grammar"/syntax rules) into the tree-sitter directory
# We have to clone it to be able to use Language.build_library to build it

tree_sitter_dir="$root/harnesslib/external/tree-sitter"

cd "$tree_sitter_dir"
git clone https://github.com/tree-sitter/tree-sitter-python.git
cd tree-sitter-python
git fetch --all --tags
git checkout de221eccf9a221f5b85474a553474a69b4b5784d

cd "$tree_sitter_dir"
git clone https://github.com/tree-sitter/tree-sitter-ruby.git
cd tree-sitter-ruby
git fetch --all --tags
git checkout c91960320d0f337bdd48308a8ad5500bd2616979


# Turn the python "grammar"/syntax rules into a parser. 
# This is called "build"ing the language, so we'll put the parser in a /build directory.
grammar_dir_python="$tree_sitter_dir/tree-sitter-python"
grammar_dir_ruby="$tree_sitter_dir/tree-sitter-ruby"


python3 -c "
import os
from tree_sitter import Language

Language.build_library(
    os.path.join('$tree_sitter_dir/build', 'my-languages.so'),
    ['$grammar_dir_python', '$grammar_dir_ruby'],
)"
```


# Keyboard shortcuts for jupyter in codespaces

You have to change it in multiple places:
    
![image](https://user-images.githubusercontent.com/38010821/229866344-4e92d4da-e032-44b1-a0ca-f069f7418aee.png)

![image](https://user-images.githubusercontent.com/38010821/229866442-130312ad-366b-4831-948a-7a9433228987.png)



# Using codespaces

- Your paths will be different in codespaces; I think the repo is checked out at `/workspaces/<reponame>`.
- You can change a codespace's machine type while it's running, at https://github.com/codespaces.
- `Developer: Show Running Extensions` shows whether Jupyter is active. You can also go to Output and select Jupyter.
- To upload files (copy them in): right-click a folder, then Upload. Upload does NOT appear if you right-click a file.


### Use codespace environment variables in development

```
import os


def is_codespace():
    """
    Detects whether we are running in a Codespace. See:
    https://docs.github.com/en/codespaces/developing-in-codespaces/default-environment-variables-for-your-codespace
    """
    # Codespaces sets CODESPACES to the string "true"; unset means not a codespace
    return os.getenv('CODESPACES', '').lower() == 'true'


def get_codespace_username():
    """
    Returns the GitHub username of the user that started the Codespace. See:
    https://docs.github.com/en/codespaces/developing-in-codespaces/default-environment-variables-for-your-codespace
    """
    return os.getenv('GITHUB_USER')
```
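
The same environment variables can smooth over the path differences mentioned earlier. This sketch assumes the repository is checked out under `/workspaces/<reponame>` in a codespace and that `GITHUB_REPOSITORY` holds `owner/repo`; `repo_root` and its `local_root` parameter are hypothetical names.

```python
import os
from pathlib import Path


def repo_root(local_root):
    """Return the repository root, depending on where the code runs.

    In a codespace, assume the checkout lives at /workspaces/<repo>;
    anywhere else, fall back to the caller-supplied `local_root`.
    """
    in_codespace = os.getenv("CODESPACES", "").lower() == "true"
    repository = os.getenv("GITHUB_REPOSITORY", "")  # e.g. "owner/repo"
    if in_codespace and "/" in repository:
        repo_name = repository.split("/", 1)[1]
        return Path("/workspaces") / repo_name
    return Path(local_root)
```

Code can then build paths like `repo_root("~/Downloads/learn-py") / "data"` instead of hard-coding one machine's layout.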