Installing Python on your computer
==================================

**Author:** Ulrich G. Wortmann



## Installing Python locally



Many computers come with Python pre-installed. However, it is best to install a private Python, so that your experiments won't interfere with your operating system. The best-known Python distribution is `Anaconda`, but the installation is huge and takes a long time. Here we will use [https://github.com/conda-forge/miniforge](https://github.com/conda-forge/miniforge), which only installs a basic system.

1.  Select the installer for your OS, then download and run it to install the Miniforge distribution.
2.  Next, we use the `mamba` package manager to install the Python packages we need (do not use the `conda` manager, it tends to be much slower).
3.  Open a command prompt and type the following:

    mamba install matplotlib numpy pandas statsmodels
    mamba install scipy spyder

4.  Some of these may already be installed. `pathlib` and `argparse` are part of the Python standard library and do not need to be installed. If you later get errors that something is missing, install the missing libraries with `mamba` as well.
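To check whether the packages from the steps above can actually be found by your new Python, a short script along the following lines may help. This is only a sketch: the function name `check_packages` is made up for this example, and the package list simply repeats the names used in the `mamba install` commands.

```python
import importlib.util
from importlib import metadata


def check_packages(names):
    """Return a dict mapping each module name to a version string or a status."""
    report = {}
    for name in names:
        # find_spec() returns None when Python cannot locate the module
        if importlib.util.find_spec(name) is None:
            report[name] = "MISSING"
            continue
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # e.g. standard-library modules carry no package metadata
            report[name] = "installed (version unknown)"
    return report


if __name__ == "__main__":
    wanted = ["matplotlib", "numpy", "pandas", "statsmodels", "scipy", "spyder"]
    for name, status in check_packages(wanted).items():
        print(f"{name:12s} {status}")
```

Any package reported as `MISSING` can then be installed with `mamba install`.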



## Running Python code from the cmd-line (terminal)



-   **Windows:** Start menu -> PowerShell
-   **Mac OS:** Open the Terminal app, either via the Spotlight icon in the upper-right corner of the menu bar, or by pressing `Command-Space bar` and then typing "terminal".

Going forward, I will refer to either of these as the shell.



#### Navigating the terminal



Type the following commands (each followed by the Enter key) in your terminal:

-   **pwd:** print working directory - this will show you where you are
-   **ls:** list directory contents - this will show you what's there. A useful variation is `ls -al`, which will also show hidden files (in PowerShell, use `ls -Force` instead)
-   **cd:** change directory. This command requires the directory name as argument. `cd ..` will change into the directory above the current directory, `cd foo` will change into the directory foo (if it exists!)
-   **mkdir:** create a new directory, e.g.,  `mkdir bar` will create a new directory called 'bar'
-   **Tab:** the auto-completion key. This saves a lot of typing. It will autocomplete command names, file names, and directory names.
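The same navigation steps are also available from within Python, which becomes useful once your scripts need to find or create files. The sketch below uses the standard-library modules `os` and `pathlib`; the function name `demo_navigation` is invented for this example, and the demo runs in a temporary directory so that it leaves nothing behind.

```python
import os
import tempfile
from pathlib import Path


def demo_navigation(base: Path) -> list[str]:
    """Run rough equivalents of pwd, cd, mkdir and ls inside `base`."""
    start = Path.cwd()                       # pwd: remember where we are
    try:
        os.chdir(base)                       # cd base
        Path("bar").mkdir(exist_ok=True)     # mkdir bar
        # ls: names of everything in the current directory
        return sorted(p.name for p in Path.cwd().iterdir())
    finally:
        os.chdir(start)                      # cd back to where we started


if __name__ == "__main__":
    # Use a throw-away directory so the demo does not clutter your disk.
    with tempfile.TemporaryDirectory() as tmp:
        print(demo_navigation(Path(tmp)))    # prints ['bar']
```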



#### Testing your installation



-   Open a command line (PowerShell, Terminal, etc.) and type `python` (or `python3` if `python` does not work).
-   You should see a line stating something like this:

    Python 3.11.5 | packaged by conda-forge | (main, Aug 27 2023, 03:34:09) [GCC 12.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.

-   Your information may be different, but it should say conda-forge.
-   You can now use this Python shell (it comes in handy as a calculator).
-   To exit, hit `Ctrl-d` (on Windows, `Ctrl-z` followed by Enter), or type `exit()`.
-   Next, enter `spyder`. This should open the Spyder IDE.
-   Create a simple file called `hello_world.py` with a suitable print statement.
-   Exit Spyder.
-   From within the terminal, check that you can see `hello_world.py` (use the `ls` or `dir` command).
-   If yes, type `python -m hello_world` (note that there is no `.py` after `-m`), which should run the script and execute the print statement. If no, consult [https://realpython.com/run-python-scripts/](https://realpython.com/run-python-scripts/) to explore where things might have gone wrong.
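The test file created in Spyder in the list above could look roughly like this. It is only one possible version: the function name `greeting` is made up for this example, and printing `sys.executable` is an optional extra that shows which Python installation is actually running the script.

```python
"""hello_world.py: a minimal script to confirm that Python runs."""
import sys


def greeting() -> str:
    """Build the message, including the version of the running interpreter."""
    major, minor = sys.version_info[:2]
    return f"Hello World from Python {major}.{minor}"


if __name__ == "__main__":
    print(greeting())
    # Path of the interpreter in use; for a Miniforge install this
    # normally points into the Miniforge installation directory.
    print(sys.executable)
```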

At this point, you have learned:

-   how to install Python
-   how to install Python libraries
-   how to use Spyder to create/edit Python code
-   how to start a Python shell from the command line
-   how to run a Python script from the command line.

There is a lot more to this, e.g., how to create icons for your Python programs, or how to create file associations so that you can open Jupyter notebooks with a right click, but all of this is operating-system dependent and sometimes takes a bit of fiddling. Also, Spyder may not be your preferred IDE, but all of this is way beyond what we can do today. [https://realpython.com/run-python-scripts/](https://realpython.com/run-python-scripts/) is a good starting point for your own explorations.



## Writing a shell script



When we write a command-line script, we need some way to interact with the user of that script, e.g., what should happen if the script needs a filename but none was supplied? Thankfully, there is a library for this: `argparse`. Note that this code will not work inside a Jupyter notebook; it is strictly for programs that are run from the command line.

    #!/usr/bin/env python3
    """ Name: filename_test.py
    Author
    Date
    Description: This code
     - imports the argparse library,
     - creates a parser object,
     - tells the parser object that this program requires a
       'filename' argument,
     - parses the cmd-line parameter(s),
     - assigns the cmd-line parameter to the variable fn,
     - prints the result of this operation.
    """
    import argparse
    
    parser = argparse.ArgumentParser()
    parser.add_argument("filename")
    args = parser.parse_args()
    fn: str = args.filename
    print(f"fn = {fn}")

Copy this code into spyder, save it as `filename_test.py`, and use the terminal to test this:

    python -m filename_test
    python -m filename_test ammonium.csv
    python -m filename_test ammonium.csv 1 2
    python -m filename_test -h
    python -m filename_test --help

Neat! `argparse` informs us that the program must be called with a filename, and it complains when we provide unknown parameters. We also get the `--help` argument (or `-h` for short) for free.
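While developing, it can be handy to try out the parser without switching to the terminal each time. `parse_args()` accepts an optional list of strings that stands in for the command line, which also lets the example run inside Spyder or a notebook. The sketch below wraps the parser in a function called `build_parser`, a name chosen only for this example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Create the same parser as in filename_test.py."""
    parser = argparse.ArgumentParser()
    parser.add_argument("filename")
    return parser


parser = build_parser()

# Equivalent to: python -m filename_test ammonium.csv
args = parser.parse_args(["ammonium.csv"])
print(f"fn = {args.filename}")

# Equivalent to: python -m filename_test  (filename missing)
# argparse prints a usage message and then raises SystemExit.
try:
    parser.parse_args([])
except SystemExit as err:
    print(f"argparse stopped the program with exit code {err.code}")
```

Because a missing argument ends the program via `SystemExit`, the rest of your script never runs with incomplete input.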

For our linear regression code, we also need to know which data columns to use. We can use `add_argument()` to add further arguments like this:

    parser.add_argument(
        "-x",  # short name
        "--independent_variable",  # long name
        dest="x",  # store as args.x
        type=int,  # check that it is an integer number
        required=True,  # this is a required parameter
        help="Column index of the independent variable",  # help text
    )

Make sure you use the above **before** you call `parser.parse_args()`. Your program would now look like this:

    import argparse
    
    parser = argparse.ArgumentParser()
    parser.add_argument("filename")
    parser.add_argument("-x", "--independent_variable",
        dest="x", type=int, required=True,
        help="Column index of the independent variable",
    )
    parser.add_argument("-y", "--dependent_variable",
        dest="y", type=int, required=True,
        help="Column index of the dependent variable",
    )
    args = parser.parse_args()
    
    fn: str = args.filename
    x: int = args.x
    y: int = args.y
    print(f"fn = {fn}, x={x}, y={y}")
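Once `fn`, `x`, and `y` are known, they can be used to pull the two data columns out of the file. The sketch below uses only the standard-library `csv` module so that it runs without extra packages; your capstone code most likely uses pandas instead. The function names, the made-up demo data, and the assumptions that the file has a single header line and that column indices start at 0 are all specific to this illustration.

```python
import csv
import os
import tempfile


def read_columns(fn: str, x: int, y: int) -> tuple[list[float], list[float]]:
    """Return columns x and y (0-based) of a CSV file as lists of floats."""
    xs: list[float] = []
    ys: list[float] = []
    with open(fn, newline="") as fh:
        reader = csv.reader(fh)
        next(reader, None)  # assumption: the first line is a header
        for row in reader:
            xs.append(float(row[x]))
            ys.append(float(row[y]))
    return xs, ys


def write_demo_file() -> str:
    """Write a tiny made-up CSV file and return its path."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".csv", delete=False, newline=""
    ) as fh:
        fh.write("depth,nh4\n1,0.5\n2,0.75\n")
        return fh.name


if __name__ == "__main__":
    demo = write_demo_file()
    print(read_columns(demo, x=0, y=1))
    os.remove(demo)  # clean up the demo file
```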

To make the program complete, we need to add a description that explains what this program will do or not do (you may also need to add a few more options, like the Shapiro-Wilk override). You can add a general program description when you create the parser object:

    parser = argparse.ArgumentParser(
                        prog='ProgramName',
                        description='What the program does',
                        epilog='Text at the bottom of help')

and optional parser arguments like this

    parser.add_argument("-s", "--shapiro_wilk",  # option names must not contain spaces
        dest="shapiro", required=False, default="yes",
        choices=["yes", "no"],  # reject any other value
        help="Perform normality test using the Shapiro-Wilk test (default: yes)"
    )
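An on/off option like this can alternatively be written as a flag that takes no value. The sketch below uses `action="store_false"`, so the test is enabled unless the flag is given. The option name `--no-shapiro` is a hypothetical choice for this example, not something prescribed by the assignment.

```python
import argparse

parser = argparse.ArgumentParser(description="What the program does")
parser.add_argument("filename")
parser.add_argument(
    "--no-shapiro",          # hypothetical option name
    dest="shapiro",          # store the result as args.shapiro
    action="store_false",    # giving the flag sets args.shapiro to False
    help="Skip the Shapiro-Wilk normality test",
)

# Passing a list to parse_args() simulates the command line.
with_test = parser.parse_args(["foo.csv"])
without_test = parser.parse_args(["foo.csv", "--no-shapiro"])
print(with_test.shapiro, without_test.shapiro)  # prints: True False
```

The advantage of a flag is that `args.shapiro` is a real boolean, so the program can test it directly with `if args.shapiro:` rather than comparing strings.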

Add these elements to your program, and test that everything works as
expected. What is left to do is to copy your code from the capstone
assignment into this command-line script, and test it against the various
datasets. You could also add some code to automatically set the filename of the
figures you are creating, add an option to select pdf or png format, etc.
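As a starting point for the figure-related ideas mentioned above, the following sketch derives the figure filename from the data filename and lets the user pick the format. The option name `-f/--format` and the helper `figure_name` are assumptions made for this example; adapt them to your own program.

```python
import argparse
from pathlib import Path


def figure_name(data_file: str, fmt: str) -> str:
    """Derive the figure filename from the data filename, e.g. foo.csv -> foo.pdf."""
    return str(Path(data_file).with_suffix(f".{fmt}"))


parser = argparse.ArgumentParser()
parser.add_argument("filename")
parser.add_argument(
    "-f", "--format",          # hypothetical option name
    dest="fmt",
    choices=["pdf", "png"],    # argparse rejects anything else
    default="pdf",
    help="File format of the figure",
)

# Simulates: python -m your_script ammonium.csv -f png
args = parser.parse_args(["ammonium.csv", "-f", "png"])
print(figure_name(args.filename, args.fmt))  # prints: ammonium.png
```

The resulting name can then be passed to matplotlib's `savefig()`, which picks the output format from the file extension.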



## Where to go from here



-   do a computer science minor
-   learn more about the data-science part. Either through formal online courses or by following blogs like [https://towardsdatascience.com/](https://towardsdatascience.com/) etc.
-   learn more about machine learning. Lots of online resources, and also a couple of ES profs who might take you on for an ESS391, etc.
-   Improve your general coding skills. There are a couple of really fun, game-like online courses (many are free).
-   Start writing your own little programs. E.g., learn how to parse command-line arguments, and add this to the regression-analysis program so you can call it like `rega.py file=foo.csv, X=1, Y=12, log="XY"` etc. The possibilities are endless and will keep you entertained for years. Have a look at [https://python.plainenglish.io/13-advanced-python-scripts-for-everyday-programming-1a52acb84101](https://python.plainenglish.io/13-advanced-python-scripts-for-everyday-programming-1a52acb84101)
-   Browse StackOverflow
-   Start a python club in your ES department
-   Here are a couple of links with ES-related Python content:
    -   [https://link.springer.com/book/10.1007/978-3-030-78055-5](https://link.springer.com/book/10.1007/978-3-030-78055-5) (you can download the pdf)
    -   [https://www.earthdatascience.org/tutorials/python/](https://www.earthdatascience.org/tutorials/python/)
    -   [https://earth-env-data-science.github.io/intro.html](https://earth-env-data-science.github.io/intro.html)

