# Instructions for installing the gds4eco environment

### Key concepts

*Python*: An open-source, interpreted programming language that is widely used in data science and geography.

*Module*: An implementation of functionality that doesn't exist in basic Python, similar to R's packages. For instance, `numpy` implements fast numerical and array computations.

*Dependency:* Modules often depend on other modules. For instance, `pandas` implements dataframes in Python and depends on `numpy`. `geopandas` adds a geographic layer to `pandas`, so it depends on `pandas` and, through it, on `numpy`.
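
To make the idea of a dependency concrete, here is a minimal sketch that lists the requirements a module declares, using the standard-library `importlib.metadata` module (assuming Python 3.8 or later). The helper name `declared_dependencies` is hypothetical, and the printed lists depend on which versions, if any, are installed in the Python you run it with.

```python
from importlib.metadata import PackageNotFoundError, requires


def declared_dependencies(package):
    """Return the requirement strings a package declares, or [] if it is not installed."""
    try:
        # requires() returns None when a package declares no requirements
        return requires(package) or []
    except PackageNotFoundError:
        return []


# The three modules mentioned above; an empty list can also mean "not installed here"
for name in ("numpy", "pandas", "geopandas"):
    print(name, "->", declared_dependencies(name))
```

If `pandas` is installed, its list should include a `numpy` requirement, which is the dependency chain described above.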

*Environment*: A Python installation with all the modules needed to accomplish a desired functionality, where specific versions of each dependency are chosen to make everything compatible.

*Conda*: An open-source environment manager for Python. It does the work in the background to install environments in which all dependencies are compatible with each other.

*Anaconda*: A popular Python distribution that includes Conda, together with Python itself and many data science modules.

*Jupyter* = open-source software project developing services for interactive computing across different programming languages (core: **Ju**lia, **Py**thon, **R**)

*Jupyter Notebook* = web-based application you can run on your browser or in VSCode to create "notebooks" - single files that contain everything you need in a data workflow (more on this later)

*VSCode:* A widely used code editor. We will use it to work on our notebooks and to run the code with the Python installation managed by Anaconda.

[*JupyterLab*](https://jupyterlab.readthedocs.io/en/stable/) (what we are using here) = evolution of Jupyter Notebook

## Installing the gds4eco environment using Anaconda and a .yml file

We will do a native installation of our environment using Anaconda. The first step is naturally to install Anaconda. 

Install the Anaconda version that matches your system from the [official webpage](https://www.anaconda.com/download). Once this is done, we will use the .yml file provided by Carmen Cabrera Arnau for the `gds4eco` course, which you can download from [this link in the GitHub repository of this small course](https://raw.githubusercontent.com/TheLeache/gds4ae_practice_23/refs/heads/main/env/gds4eco.yml).

### .yml (YAML) files

These files contain the instructions Anaconda needs to create your environment. Each one is basically a list of the dependencies you want to install, usually with specific versions, to ensure the compatibility and replicability of the environment you are using.
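
To show what such a file looks like, the sketch below embeds a hypothetical, heavily shortened environment file and reads its list of dependencies. The file content (environment name, channel, version pin) is invented for illustration and is not the content of `gds4eco.yml`. The helper `top_level_list` is also hypothetical and only handles the flat layout shown here, not nested sections.

```python
# Hypothetical, shortened environment file. The real gds4eco.yml lists
# different (and many more) packages and versions.
EXAMPLE_YML = """\
name: example_env
channels:
  - conda-forge
dependencies:
  - python=3.11
  - numpy
  - pandas
  - geopandas
"""


def top_level_list(yml_text, key):
    """Return the items of a flat top-level list such as `dependencies:`."""
    items, inside = [], False
    for line in yml_text.splitlines():
        stripped = line.strip()
        if line and line[0] not in " -":
            # A new top-level key starts; remember whether it is the one we want
            inside = stripped == f"{key}:"
        elif inside and stripped.startswith("- "):
            items.append(stripped[2:])
    return items


print(top_level_list(EXAMPLE_YML, "dependencies"))
```

In a real project a YAML library would be a more robust way to read the file; the hand-written parser only keeps the example free of extra dependencies.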

For instance, if you are running lots of Python code for a research paper, you would produce a .yml file that replicates the Python environment you used, and you would include it in your replication package. That way, the data editor in a journal could install the same Python environment you used in your project, and verify that the code produces the same results without running into errors. 

Another use would be if a team is producing Python software. Then, they would have to make sure they are using the same dependencies to not run into compatibility problems. They could use Anaconda to install the same environment across all the machines that the team is using. 

#### How to use the .yml file:

If you have a .yml file located at `path_to_explicit_file.yml`, open the Anaconda terminal and run the following command, replacing `env_name` with the name you want to give the environment:

`conda env create --name env_name --file path_to_explicit_file.yml`
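
If you prefer to script this step, the command above can be assembled and launched from Python. This is only a sketch under some assumptions: the function names `conda_create_command` and `create_environment` are hypothetical, and it assumes the `conda` executable is on your PATH, which is normally the case inside the Anaconda terminal.

```python
import shutil
import subprocess


def conda_create_command(env_name, yml_location):
    """Build the argument list for `conda env create` from a name and a .yml location."""
    return ["conda", "env", "create", "--name", env_name, "--file", yml_location]


def create_environment(env_name, yml_location):
    """Run the command if conda can be found; return True when it succeeds."""
    conda_path = shutil.which("conda")
    if conda_path is None:
        print("conda not found: open the Anaconda terminal and try again")
        return False
    command = conda_create_command(env_name, yml_location)
    # Use the resolved path so the same call works whichever conda launcher is installed
    result = subprocess.run([conda_path, *command[1:]])
    return result.returncode == 0


# Show the command without running it
print(" ".join(conda_create_command("env_name", "path_to_explicit_file.yml")))
```

Calling `create_environment("env_name", "path_to_explicit_file.yml")` would then run the same command as typing it in the terminal.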

### Installing gds4eco:

You can run the command above using the URL of the `gds4eco.yml` file as the value of `--file`. Alternatively, you can save the content at that URL to a local file on your machine and pass the path to that file instead.

#### Anaconda terminal for Windows:

If you use Windows, type "Anaconda Powershell Prompt" in the search bar and open it.

#### Anaconda terminal in Mac and Ubuntu: 

If you use a Unix-like system such as macOS or Ubuntu, open your computer's Terminal. Once you have Anaconda installed, you can run conda commands from it.

### You can install the new environment with a single command:

The command is:

`conda env create --name gds4ae --file https://raw.githubusercontent.com/TheLeache/gds4ae_practice_23/refs/heads/main/env/gds4eco.yml`

Alternatively, you can download the file to your machine and create the environment from the local copy. For instance, if I download it to `G:/My Drive/CEMFI/2020_PhD/gds4ae_practice_23/env/gds4eco.yml`, I would first switch the Anaconda terminal to the G: drive like so:

`cd G:`

and then I would run the line:

`conda env create --name gds4ae --file "My Drive/CEMFI/2020_PhD/gds4ae_practice_23/env/gds4eco.yml"`
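
Once the environment has been created, you can activate it with `conda activate gds4ae` and check that it works. The sketch below, run with the Python of the activated environment, reports which of a list of modules cannot be found. The helper name `missing_modules` is hypothetical, and the list of expected modules is an assumption based on the modules mentioned in the key concepts, so adjust it to what the .yml file actually installs.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found in this Python."""
    return [name for name in names if find_spec(name) is None]


# Assumed to be part of the environment; edit to match the .yml file
expected = ["numpy", "pandas", "geopandas"]

missing = missing_modules(expected)
if missing:
    print(f"Missing modules: {missing}")
else:
    print("All expected modules were found")
```

If some modules are reported as missing, a common cause is that the script ran in a different environment than the one you just created.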

### You can find more information about managing environments [here in the official documentation](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html)



