# Previous Lesson Overview - Day 9 Intro to seaborn visualizations

Instructor/Notebook creator: Maria D Hernandez Limon

**AI assistance:** Maria used ChatGPT on *Aug2025* to help with commenting code, brainstorm and design of homework problems.
I reviewed/edited the code, added comments, and validated results with tests/spot checks.

In the previous lesson you learned about:
1. Plotting with Seaborn 
2. Simple stats with SciPy
3. Maps with CartoPy and GeoPandas
4. Making gifs

# Day 10: Beyond Jupyter 

There are many ways to run Python code that make it faster and easier to share! In this lesson we will discuss how to do that.




# 0. Project Set-up

When starting a new project you want a directory that looks like this:
```
project/
├─ data/raw/      # never overwritten
├─ scripts/       # exploration notebooks & reproducible scripts
├─ modules/       # your reusable functions
├─ outputs/       # figures / tables
└─ requirements.txt #some read me with instructions
```

- Ideally this space is also a github repo that you update often!
- And you have a mamba/conda env that houses all your packages
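
If you would rather build that layout from Python than by hand, here is a minimal sketch. The function name `make_project_skeleton` is made up for this example, and the demo at the bottom builds the folders inside a temporary directory so it cannot touch a real project.

```python
import tempfile
from pathlib import Path


def make_project_skeleton(root):
    """Create the folder layout shown above under `root` and return the folders made."""
    root = Path(root)
    folders = ["data/raw", "scripts", "modules", "outputs"]
    created = []
    for folder in folders:
        path = root / folder
        # parents=True builds data/ before data/raw; exist_ok=True makes reruns safe
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    # touch() creates an empty file and leaves an existing file's contents alone
    (root / "requirements.txt").touch()
    return created


if __name__ == "__main__":
    # Demo in a throwaway folder; point it at a real path when you mean it
    with tempfile.TemporaryDirectory() as tmp:
        for path in make_project_skeleton(Path(tmp) / "project"):
            print(path.relative_to(tmp))
```

Because of `exist_ok=True`, running it twice on the same folder does no harm.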

# 1. Running Python scripts

## A - Jupyter notebooks

A Typical Research Workflow

1) Data collection & questions

- **Collect:** collect or create data to explore
- **Document:** in `README.md`, always write documentation of what you are working with
- **Questions (examples):**
  - Is ice cover decreasing over time in each lake?
  - Do ER vs SU daily temperature distributions differ?

2) Explore in Jupyter

3) Confirm patterns with simple stats

4) Scale to High Performance Computing (supercomputers like UChicago RCC), need .py

5) Produce final figures (consistent & saved)

6) Share the work (reproducibly)

## B - `.py` Files

This allows for automation and large scale.

### [This](http://stanford.edu/~jainr/basics.py) site has a great explanation of .py. 
In general there are two ways I use .py scripts.

**1. Save functions you have written in some application like jupyter for future use**

**2. Write .py scripts to run on your computer's terminal (or a super computer)**

**How to write a .py script:**

To create a .py script you open any text editor, add your code and save the file with a .py at the end.
Your code would be the lines we have written here. See `lake_wanted_timeseries.py` for an example.

**NOTE: Change the datadirectory in lake_wanted_timeseries.py**

There are a few parts that your script will need:

1. Add the shebang line `#!/usr/bin/env python3` as the very first line. This tells your computer which program should run your script.
2. Import libraries.
3. Your code, including directories for your input/output.
4. To run a .py directly as `./our_script.py` we must first make it executable by navigating to where our script is:
   - open a terminal and type `cd directory_of_your_script`
   - then type `chmod +x our_script.py`
5. Then to run we use `python3 ./our_script.py anyargumentsforourscripthere` (calling `python3` explicitly also works without the `chmod` step).
6. Or we import from it by using `from our_script import some_function`.

**The shebang line**
    - When you execute a file from the shell, the shell tries to run the file using the command specified on the shebang line. The # character is used because it defines a comment in most scripting languages (including python), so the shebang line will be ignored by the scripting language by default.
    - Some background on the shebang which is not necessary to know but may be interesting to some: "The shebang line was invented because scripts are not compiled, so they are not executable files, but people still want to "run" them.  The shebang line specifies exactly how to run a script.  In other words, this shebang line says that, when I type in `./my_script.py`, the shell will actually run `/usr/bin/env python my_script.py`" - Sam King
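
Putting those parts together, a script could be laid out roughly like this. This is a minimal sketch and not the contents of `lake_wanted_timeseries.py`: the names `parse_lake` and `main`, and the fallback value `"SU"`, are placeholders I made up for illustration.

```python
#!/usr/bin/env python3
"""Skeleton of a small command-line script (hypothetical example)."""

# 1. import libraries
import sys


# 2. your code, wrapped in functions so it can also be imported elsewhere
def parse_lake(argv):
    """Return the first command-line argument, or "SU" if none was given."""
    return argv[0] if argv else "SU"


def main(argv):
    lake = parse_lake(argv)
    message = f"would process lake {lake}"
    print(message)
    return message


# 3. this block runs only when the file is executed, not when it is imported
if __name__ == "__main__":
    main(sys.argv[1:])
```

The `if __name__ == "__main__":` check is what lets one file work both ways: run it from the terminal and `main` is called, import it from a notebook and only the functions are defined.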

### 1. How to import functions from a .py?
You can import functions from a .py file without making the file executable. The shebang line is not required for importing either: it only matters when you run the file directly from the shell (for example `./func_home.py`). Keeping it does no harm and lets the same file be run as a script later.

Note: any library your functions use must be imported inside the .py file itself.

Let's make `func_home.py` with the following functions.

```python
#!/usr/bin/env python3

# the libraries your functions need
import numpy as np

def print_name(some_name):
    print(f"the name is {some_name}")

def print_len_name(some_name):
    print(f"the length of the name is {len(some_name)}")

def print_max_num(some_name):
    return len(some_name)
```


I am going to make func_home.py now. 

  1. Create a new text file
  2. Rename it to func_home.py
  3. Start with the shebang
  4. Include the code
  5. Make sure your file lives in the same directory as the notebook you are working on. If not, there are ways around it, but I never do that... so I will let you look it up :)

In [None]:
from func_home import print_name
print_name('maria')

from func_home import print_len_name
print_len_name('maria')

from func_home import print_max_num
print_max_num('maria')
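
For the curious, one common way around the "same directory" rule is to add the other folder to `sys.path`, the list of places Python searches when it sees an `import`. The sketch below is self-contained: it writes a tiny module into a temporary folder standing in for something like `modules/`, then imports from it. The names `demo_funcs` and `name_length` are made up for this example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Stand-in for a separate folder such as project/modules/ (hypothetical location)
module_dir = Path(tempfile.mkdtemp())

# Write a tiny module there; demo_funcs and name_length are made-up names
(module_dir / "demo_funcs.py").write_text(
    "def name_length(some_name):\n"
    "    return len(some_name)\n"
)

# Python only searches the folders listed in sys.path, so add ours to the list
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()  # make sure the freshly written file is noticed

from demo_funcs import name_length

print(name_length("maria"))  # 5
```

In a real project you would skip the temporary folder and insert the path to your own `modules/` directory instead.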

#### Check-in [1]

Write your own my_funcs.py, make it executable, and write a function that prints and returns how many letters are in your favorite color. Then in this notebook (Day 10) import the function like we did above. Tell me how many letters my favorite color 'green' has, and how many yours has.

In [None]:
# import functions

# find our answers

#### answer

In [None]:
def print_len_col(some_color):
    # print first: nothing after a return statement ever runs
    print(f"the length of the color is {len(some_color)}")
    return len(some_color)

print_len_col("green")  

In [None]:
from my_funcs import print_len_col

print_len_col("green")

### 2. How to run a .py?

After you make your .py executable, you run it by navigating to the location of your script and running the following command:

`python3 ./our_script.py anyargumentsforourscripthere`

In the example below, lake_wanted_timeseries.py is my script and it takes a lake's initials as input, so I am typing SU for Superior.

In [None]:
import os

I like to use os when I need to execute code and don't want to leave my notebook.

`os` is Python’s standard library module for talking to your operating system.
It handles things like folders/files, environment variables, and running shell commands.
No install needed: it ships with Python.
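
Here is a quick tour of those three jobs, as a small sketch that only reads information and changes nothing on disk. The folder name `outputs` and the file name `SU_timeseries.pdf` are just illustrative.

```python
import os

# Where am I? Same idea as `pwd` in the terminal
here = os.getcwd()
print("working directory:", here)

# What is in this folder? Same idea as `ls`
entries = sorted(os.listdir(here))
print("number of entries:", len(entries))

# Build a path without worrying about / versus \ on different systems
figure_path = os.path.join("outputs", "SU_timeseries.pdf")
print("figure path:", figure_path)

# Read an environment variable, with a fallback if it is not set
home = os.environ.get("HOME", "not set")
print("HOME:", home)
```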

In [None]:
os.system('chmod +x lake_wanted_timeseries.py')
os.system('python3 ./lake_wanted_timeseries.py "SU"')

For the shell’s purposes, a command which exits with a **zero exit status has succeeded**. A non-zero exit status indicates failure. --https://www.gnu.org/software/bash/manual/html_node/Exit-Status.html
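
If you want to act on that exit status from Python, the standard library's `subprocess.run` hands it to you as `returncode`. The sketch below is an alternative to `os.system`, not what the cells above use; it launches two throwaway Python processes that do nothing except exit with a chosen status.

```python
import subprocess
import sys

# Two throwaway child processes: one exits with status 0, the other with status 3
ok = subprocess.run([sys.executable, "-c", "import sys; sys.exit(0)"])
bad = subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"])

for label, result in [("ok", ok), ("bad", bad)]:
    if result.returncode == 0:
        print(f"{label}: succeeded")
    else:
        print(f"{label}: failed with exit status {result.returncode}")
```

`sys.executable` is the Python that is running your notebook, so the child processes use the same environment.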

I can use string formatting to do the following, instead of writing the same line 5 times (once per lake). It also leaves a record in my jupyter notebook of the commands I ran in the terminal.

In [None]:
for lake in ['SU','ER','MI','ON','HU']:
    os.system('python3 ./lake_wanted_timeseries.py {}'.format(lake))

### Check-in [2]

Use the method above to write amic_ts.py. This script should:
- take the wanted lake as an argument
- read in amic, housed in the data folder as great_lakes_maxice.csv (tip: save it as temp and reuse as much code from lake_wanted_timeseries.py as you can)
- subset the data to only include the wanted lake
- save the figure it makes
- remember to make the code executable
- run in a for loop so you have an image per lake

#### answer

``` python 
#!/usr/bin/env python3

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import sys

#select a wanted lake
lake_wanted=sys.argv[1]

#this is the directory where we want to store the data we finish analyzing---------CHANGE
data_out_directory='../output/'

#import data
temp_data = pd.read_csv("../data/great_lakes_maxice.csv")

##subset data
lake_subset=temp_data.loc[temp_data['Lake']==lake_wanted]
lake_subset.plot(x='Year',y='Cover',kind='line',title=f'{lake_wanted}')
plt.tight_layout()
plt.savefig(data_out_directory+f'{lake_wanted}_timeseries.pdf')

```

In [None]:
os.system('chmod +x amic_ts.py')

In [None]:
os.system('python3 ./amic_ts.py "Superior"')

In [None]:
for lake in ['Superior','Erie','Michigan','Ontario','Huron']:
    os.system('python3 ./amic_ts.py {}'.format(lake))

I know these examples are very simple but the point is to illustrate functionality. 
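
One small step up from reading `sys.argv[1]` directly is to let the standard library's `argparse` check the argument for you, so a misspelled lake stops the script with a clear message instead of quietly producing an empty plot. This is a sketch, not part of the answer above: `build_parser` and `parse_lake` are made-up names, and the list of lake names is my assumption about how they are spelled in the `Lake` column of the CSV.

```python
import argparse

# Assumed spellings of the lake names in the CSV's Lake column
LAKES = ["Superior", "Erie", "Michigan", "Ontario", "Huron"]


def build_parser():
    parser = argparse.ArgumentParser(description="Plot max ice cover for one lake.")
    # choices= makes argparse reject anything that is not in the list
    parser.add_argument("lake", choices=LAKES, help="lake to plot")
    return parser


def parse_lake(argv):
    """Return the validated lake name from a list of command-line arguments."""
    return build_parser().parse_args(argv).lake


if __name__ == "__main__":
    # In a real script this would be parse_lake(sys.argv[1:])
    print(parse_lake(["Erie"]))
```

When the argument is not in `LAKES`, `argparse` prints a usage message and exits with a non-zero status, which ties back to the exit-status idea above.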


**More on `os`**: interacting with the terminal via jupyter

OS description: https://www.geeksforgeeks.org/python-os-system-method/

OS information: https://thomas-cokelaer.info/tutorials/python/module_os.html

## C - Command Line : SKIP

The Command Line by Amanda Farah

The command line is kind of like a much more powerful Finder, but you have to type everything instead of clicking. You can move, copy, create, and rename files, and you can move between folders and create or rename them too. In the command line, you can launch any application you want, open any file you want, search for files, search for strings in files, and much more. Think of it as a way to speak more directly to your computer.

The command line is a very powerful interface for your computer, but it is also a universal interface for all Unix-based computers (Macs and PCs running Linux or other Unix operating systems, but not Windows machines). This means the command line is how you will interact with computing clusters/supercomputers. Most clusters don't have a way to control them graphically (i.e. by clicking on things). Instead, they just have a command line interface. Even if they do have a graphical interface, it's usually very cumbersome, slow, and limited, so learning command line syntax is the way to go. When you eventually scale up your code to use clusters, you might find it helpful to come back to this notebook.

The languages most commonly used on the command line are `bash` and `zsh`. They are very similar, and for our intents and purposes, they are the same. I'll probably just refer to both of them as `bash`.

### 1. Commands you know and love
Let's start with the usual commands we do every day. Open a new terminal window and type the following:
```
cd <your_path>/intro-programming-2025 
```
`cd` means "change directory". So this command changed our directory from wherever the terminal started to your intro programming directory. Here, the _command_ is `cd` and the _argument_ is the path you would like to navigate to. 
```
git pull
```
Git is a software installed on your computer, like python or microsoft word. This line says "run the software called git. Now that you know I'm talking to the software called git, I want you to specifically run the part of the software that pulls modifications from an online repository to my local version"
```
cd notebooks
```
Again, change directories. This time, we move to the `notebooks` directory.
```
jupyter notebook
```
Jupyter is another software installed on your computer. This line says "run the software called jupyter, and specifically run the notebook part of that software." Now, Open today's notebook and give a green check ✅ when you're ready.

### 2. New directory-navigation commands
Now for some more commands related to directory navigation. 

It might be useful to have this picture of our directory structure in your head as we proceed. 
```
intro-programming-2025
    |
    - README.md
    - notebooks
        |
        - Day0_Setup.ipynb
        - Day1_Data_and_Storage.ipynb
          ...
    - data
        |
        - Pokemon.txt
        - star_formation_rate_MD.csv
         ...
```
Here, `intro-programming-2025` is _above_ `data`, and `Pokemon.txt` is _below_ `data`. `data` is _below_ `intro-programming-2025` and _level with_ `notebooks`.

0. Open a new terminal tab or window. Then, navigate to the `notebooks` directory like we do every day. Give a ✅ when you get there or a ❌ if you run into errors.

1. Let's make a new directory called "my_directory."
    ```
    mkdir my_directory
    ```
    Here, the _command_ is called `mkdir` for "make directory" and the _argument_ is the name of the directory. The user supplies the argument, and in this case we decided to pass `my_directory`. 
2. Let's check that we have made the directory by listing everything in our current directory.
    ```
    ls
    ```
    Here, `ls` stands for "list". You should see listed all the contents of the `notebooks` directory, where we currently are. Check that, in addition to all of the notebooks we've used so far, you see `my_directory`.
    - Sidenote: You can list files in other directories without navigating into them. Try listing everything in the above directory
    ```
    ls ../
    ```
    Where `..` means "move up one". You could also do `../../` to move up two, and so on. Try listing everything in the `data` directory.
    ```
    ls ../data/
    ```
    - Sidenote: You can list files that have a specific pattern in them using _wildcards_. List all of the jupyter notebook files in the current directory:
    ```
    ls *.ipynb
    ```
    List all of the `.csv` files in the `data` directory:
    ```
    ls ../data/*.csv
    ```    
    Note: You can use wildcards for many commands in the command line, not just `ls`.
3. Navigate into our new directory
    ```
    cd my_directory
    ```
    Again, `cd` means "change directory."
4. Check that we've navigated there
    ```
    pwd 
    ```
    This command means "print working directory" and the output gives the full path to your current location in the file structure. Give a ✅ if its output makes sense to you, and a ❌ if not.
5. Make a text file in the directory called "my_file.txt" with the words "Command lines are cool" in it. You can do this in many ways, but here is a simple one.
    ```
    echo "command lines are cool" > my_file.txt
    ```
6. Open the file in your default text editor.
```
open my_file.txt
```
Here, `open` is the command and the argument is the filename, in this case `my_file.txt`.

7. Move/rename the file
``` 
mv my_file.txt my_cool_file.txt
```
Here, `mv` means "move". This command moves contents of files to new locations. In this case, the new location was the same directory but inside a file with a different name. Check it worked by running `ls`. Double check by running `open my_cool_file.txt`. 

8. Delete the file
```
rm my_cool_file.txt
```
Check that it is really gone with an `ls` and an `open`. 
Note: `rm` can't be undone! There is no recycle bin here. Regular backups are important. 

9. Tab completion: Look at Amanda's terminal for this.

Once you get the hang of these commands, they become much, much faster than clicking around in a Finder window. 
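
The same file steps can also be done from inside Python with the standard library's `pathlib`, which is handy when they are part of a script. Below is a sketch that mirrors the `mkdir`, `echo >`, `mv`, `ls`, and `rm` steps above. It works inside a temporary folder so nothing real is touched, and that temporary folder is my addition, not part of the exercise.

```python
import tempfile
from pathlib import Path

# Work inside a throwaway folder so nothing real gets touched
with tempfile.TemporaryDirectory() as scratch:
    my_directory = Path(scratch) / "my_directory"
    my_directory.mkdir()                              # mkdir my_directory

    my_file = my_directory / "my_file.txt"
    my_file.write_text("command lines are cool\n")    # echo "..." > my_file.txt

    my_cool_file = my_directory / "my_cool_file.txt"
    my_file.rename(my_cool_file)                      # mv my_file.txt my_cool_file.txt

    names_after_move = sorted(p.name for p in my_directory.iterdir())  # ls
    print(names_after_move)

    text = my_cool_file.read_text()
    print(text.strip())

    my_cool_file.unlink()                             # rm (just as permanent!)
    names_after_delete = sorted(p.name for p in my_directory.iterdir())
    print(names_after_delete)
```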

### 3. New fancier commands

1. Download stuff from the internet using `curl`
    - `curl` is short for "Client for URLs." Check out the documentation [here](https://curl.se/docs/manpage.html).
    - We will download the "research and development survey" from this website: https://www.stats.govt.nz/large-datasets/csv-files-for-download/
    - do this using `curl` in two ways:
    ```
    curl https://www.stats.govt.nz/assets/Uploads/Research-and-development-survey/Research-and-development-survey-2021/Download-data/Research-and-development-survey-2021-CSV-notes.csv
    ```
    This downloads the file and displays it. But if we want to save the file on our computer, we will use the `-o` option and specify a file. We will save the data to a file called "RnD.csv" in the output directory.
    ```
    curl -o ../output/RnD.csv https://www.stats.govt.nz/assets/Uploads/Research-and-development-survey/Research-and-development-survey-2021/Download-data/Research-and-development-survey-2021-CSV-notes.csv
    ```
    Once you do this, run `ls ../output/` and see if your new file is there.
    
1. Find files using `find`
    - Finally! A command named something that makes sense. This is used when I'm looking for a file but I don't know where it is on my computer. Check out the documentation [here](https://linux.die.net/man/1/find).
    - The general form is: `find <starting directory> <matching criteria and actions>`
    - So, if we want to find all `txt` files in the "output" directory, we can do 
    ```
    find ../output -name "*.txt" -print
    ```
    The first argument is the place to look for your files (here, it was `../output`). After that, it's all optional arguments. These are a lot like keyword arguments in python. The first optional argument is `-name`, which accepts a pattern. This is the pattern we are searching for (here, it was `"*.txt"`, where we used `*` as a wildcard). The second optional argument is `-print`, which tells `find` to print the results of its search. Try to run the command.
    
1. Find contents in files using `grep`
    - Syntax: `grep <string to search for> <options> <starting directory>`
    - Look for all lines in all files that have the string "Huron", starting one level up, and going "recursively" through all directories under our starting directory:
    ```
    grep "Huron" -r ../
    ```
    Here, "Huron" is the string we want to search for, "-r" is telling `grep` to go recursively (meaning go through each sub directory under the starting directory), and "../" means start one directory up from where we are now.
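
For comparison, here is roughly what `find` and `grep -r` do, sketched in Python with `pathlib`. The function names `find_files` and `grep_lines` are made up, and this version is simplified: it only returns files (the real `find` also matches directories) and it looks for plain text rather than the patterns `grep` understands. The demo builds a couple of small files in a temporary folder so it runs anywhere.

```python
import tempfile
from pathlib import Path


def find_files(start, pattern):
    """Like `find <start> -name <pattern>`: matching files under start, recursively."""
    return sorted(p for p in Path(start).rglob(pattern) if p.is_file())


def grep_lines(start, text):
    """Like `grep -r <text> <start>`: (path, line number, line) for every match."""
    hits = []
    for path in sorted(Path(start).rglob("*")):
        if not path.is_file():
            continue
        try:
            lines = path.read_text().splitlines()
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(lines, start=1):
            if text in line:
                hits.append((path, number, line))
    return hits


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        root = Path(scratch)
        (root / "output").mkdir()
        (root / "output" / "notes.txt").write_text("Superior\nHuron is big\n")
        (root / "output" / "table.csv").write_text("Lake,Cover\nHuron,50\n")
        print([p.name for p in find_files(root, "*.txt")])
        for path, number, line in grep_lines(root, "Huron"):
            print(f"{path.name}:{number}:{line}")
```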


It is very useful to keep a small text file (or other set of notes) of useful commands. Unlike `python`, `bash` and `zsh` don't have very intuitive syntax or naming conventions, so it can be hard to memorize their commands. Katie, Maria, and Amanda all keep a short list of useful commands, especially ones with confusing syntax, along with explanations of how to use them.

### 4. Python in The Terminal

Follow these steps:
1. Open a new terminal tab or window
2. Navigate to your working directory, like we do every day. We will work out of the `notebooks` directory.
3. Type `python`
    - Your terminal should display something similar to the following:
    ```
    Python 3.8.5 (default, Sep  4 2020, 02:22:02) 
    [Clang 10.0.0 ] :: Anaconda, Inc. on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> 
    ```
    - If it displays `Python 3.X`, where `X` is any value, you are good!
    - If it displays `Python 2.Y` for any value of Y, type `exit()` and press enter. Then type `python3`. Ask a TA if this doesn't work.
    
4. You are now running python interactively! Let's try a few things out to see how interactive python works.
    - Type a print statement with a string as the argument to `print()`. Press enter
    - Type a mathematical expression (e.g. `1+1`) and press enter
    - Notice that inputs are always on lines denoted by `>>>` or `...` and outputs start newlines without these characters.
5. Let's try some more nuanced commands that we are used to using on Jupyter Notebooks.
    - Try some of our usual imports (e.g. `import numpy as np`, then try `import matplotlib.pyplot as plt`)
    - Define a variable, calling it whatever you want and giving it whatever value you want, using any datatype you want. Press enter.
    - Type the variable's name. Press enter.
5. Now let's try some code we've seen before to see what happens in interactive mode.
    - First, I'll copy some things from Day 2, where we had some fun with lists. Start by replicating the following line in your terminal, then follow along with the instructor. Try to predict what will happen with each line before pressing enter.
    ```python
    animals = ['cat', 'dog', 'squid', 'moose', 'falcon']
    ```
    - Now, we will try some things using `pandas`. Remember, to use a package we need to import it first. We haven't imported `pandas` yet, so let's do that now.
    - We're going to load in some data that we used earlier. This data is stored in a csv format. I don't quite remember the syntax for the `pandas` function for this, but we don't want to leave the terminal because that's a lot of work, so let's try the built in `help()` function. 
    - It works very similarly to using it in jupyter! You can even scroll. However, there are no buttons to close out of the viewing mode once you're in it, so you'll have to type `q` to leave after you've read everything you need.
    - You can re-run the same line without having to re-type the whole thing by using the "up" button on your keyboard.
    - Looking at the docstring, it seems like the first argument is the file name, and all other arguments are optional. the default delimiter is `,`, which is exactly what we want, so we can just go with default behavior.
    - Load in the file but don't assign it to a variable. You'll have to figure out the path. The file's name is `Pokemon.csv` and it is in the `data` directory. 
    - You should see the column names as well as the first and last 5 rows of data.
    - Now load in the file but this time assign it to a variable. I'll call my variable `data`. Notice how there's no output now.
    - _**Coding check-in:**_ Play around with your dataframe. How many Pokemon have Grass as their first type? What's the mean attack value?
6. How does plotting work?
    - With our nice graphical user interface (GUI) gone, we're stuck with a very basic looking screen that doesn't seem like it has support for the beautiful plots you've learned how to make. What happens when you try to make a plot?
    - We've already imported `matplotlib` and `numpy`, so we have all the libraries we need for a simple plot. 
    - Make an array full of increasing `float` values.
    - Plot the square of that array vs the array itself.
    - you might get taken away from the terminal for seemingly no reason. There is a reason, just go back to the terminal for now.
    - Run `plt.show()`. A plot should pop up.
    - You must close your plot to have access to the command line again.
7. Functions
    - Write a simple "hello world!" function.
    - You have to provide your own indentation in interactive python. Otherwise you will get the dreaded `IndentationError: expected an indented block`!
    - An empty newline indicates you're done writing the function. 
    - `...` indicates being inside an indented block, `>>>` indicates being outside of it
    - Call your function.
    - Write a function that takes in one argument, applies something to it, and returns a new value.
    - Call your function.

If I'm writing a super long function or a complex plotting script and I make a typo, I have to go and redo everything. Even with our newly-beloved "up" key, this is extremely annoying. An obvious solution might be to go back to jupyter notebooks, but there are many situations in which that is not possible. Instead, we will move on to a generally useful tool, `.py` scripts.

9. Leave interactive mode by typing `exit()`. Don't close your terminal window.

### 5. **Maria** says no coding in terminal unless you have a .py

The explanations from Amanda above are great so I am keeping them as a resource for you.
However, PLEASE write all your code in a .py or .sh or some file that is saved. 
Anything you write in the terminal is basically lost when you close the session.
Future you and other collaborators will truly hate you if you can't replicate the code you only gave the terminal.


## D- in supercomputers (HPC)
HPC = High-Performance Computing.

It’s the practice of using many powerful computers together (a cluster / supercomputer) to run big or time-critical jobs faster than a laptop can. At UChicago we have the [RCC](https://docs.rcc.uchicago.edu/).

What it looks like:

- Cluster: lots of nodes (machines), each with many CPU cores and often GPUs, connected by a fast network.
- Scheduler: software (e.g., SLURM) that queues your jobs and gives them resources (cores, memory, time).
- Storage: shared filesystems (home, project, scratch). You move data in/out and read/write from your jobs.

Why use it:

- Your data/code is too large/slow for a laptop.
- You can run many jobs in parallel (e.g., one per lake/parameter).
- You need GPUs or lots of memory/cores.

How to communicate:
- Need .py or .R files that house your actual code
- Need .sh that tells the computer how/where to run your .py -- this is needed because HPC has a scheduler and so you need to pass instructions

The [2022 advanced course](https://github.com/NRT-DSEER/computing-for-research-2022/tree/main/notebooks) has a really nice explanation of the RCC on day6, take a look!

### 1. Working with .sh

- .sh files, known as shell scripts, are written in bash, which is the language we use to pass commands to the terminal (like `cd` and `ls`). See [Shell_scripts](https://www.shellscript.sh/).

- If you have multiple .py scripts that need to be run sequentially, or maybe a .py and a .R script, we could house them both in a .sh and run the .sh script. 

- If you find yourself typing many commands on the command line, you can also write them in a .sh script and run that instead of typing every command repeatedly.

- All shell scripts start with a shebang too, `#!/usr/bin/env bash`, but the top of the file gets more complicated if you are running on a supercomputer instead of your local computer, because the scheduler instructions (`#SBATCH` lines) go right below the shebang. 
See `lakes_wanted_local_example.sh` and `lakes_wanted_rcc_example.sh` for very simple examples.

- We also use `chmod +x myscript.sh` to make shell scripts executable (regardless of whether you are on your computer or a supercomputer).

####  use to run .sh locally [lakes_wanted_local_example.sh]

``` bash
#!/usr/bin/env bash

for i in SU ER ON HU MI
do
    python3 ./lake_wanted_timeseries.py "$i"
done
```

In [None]:
os.system('chmod +x lakes_wanted_local_example.sh')
os.system('./lakes_wanted_local_example.sh')

#### to submit sh to super computer [lakes_wanted_rcc_example.sh]

``` bash
#!/bin/bash
#SBATCH --job-name=example_sbatch
#SBATCH --output=example_sbatch.out
#SBATCH --error=example_sbatch.err
#SBATCH --time=00:05:00
#SBATCH --partition=broadwl
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=14
#SBATCH --mem-per-cpu=2000

#we load python via anaconda first
module load python/anaconda-2020.02

#then we tell the computer to run our .py script
python3 ./lake_wanted_timeseries.py 'SU'

```

You must be on a super computer cluster first and your .sh script and scripts inside this script should be executable and on the super computer too. Then:

Navigate to your script and, after you've updated the `#SBATCH` lines at the top of the .sh, type the following on the terminal.

`sbatch ./lakes_wanted_rcc_example.sh`

Your job will then be added to a queue and executed when it becomes your turn.
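
If you end up wanting one job per lake, you could write the five .sh files by hand, or let Python produce the text for you. The sketch below does the second. The function name `render_job_script` and the `lake_<name>` job and log names are made up. The `#SBATCH` values are copied from the example above, except that I left out the node, task, and memory lines to keep it short, so add back whatever your job needs.

```python
def render_job_script(lake, minutes=5, partition="broadwl"):
    """Return the text of a SLURM batch script that runs the lake script for one lake."""
    # minutes must be below 60 in this simple sketch (the format is HH:MM:SS)
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name=lake_{lake}",
        f"#SBATCH --output=lake_{lake}.out",
        f"#SBATCH --error=lake_{lake}.err",
        f"#SBATCH --time=00:{minutes:02d}:00",
        f"#SBATCH --partition={partition}",
        "",
        "module load python/anaconda-2020.02",
        f"python3 ./lake_wanted_timeseries.py '{lake}'",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for lake in ["SU", "ER", "ON", "HU", "MI"]:
        print(render_job_script(lake))
```

To use it for real you would write each string to its own .sh file and submit each one with `sbatch`.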

### 2. Working on Midway 

The [Day 6 lecture](https://github.com/NRT-DSEER/computing-for-research-2022/blob/main/notebooks/Day6_RCC.ipynb) from the advanced coding class in 2022 covered Midway and running jupyter notebooks on midway, please consult their lesson. 

The [RCC](https://docs.rcc.uchicago.edu/) website is also a great resource and they run [trainings](https://rcc.uchicago.edu/support-and-services/workshops-and-training) all the time. 

The workflow below is what I do.

#### jupyter notebooks
**For data exploration that requires more power:**

0. Open terminal
1. ssh into midway3: `ssh uchicagoid@midway3.rcc.uchicago.edu`
2. Enter password and validate -- you need special permission 
3. Navigate to my working directory
4. I request an interactive session:
     ``` sinteractive --exclusive --partition=caslake --nodes=1 --time=01:00:00 --account=pi ```
     Avoid working on the login node.
     The interactive session will start in your last directory, so be in the right place.
     When your time is over the session closes on you, so be mindful of your request.
5. Load the conda env I created, like we did for class: ``` conda activate mdh_env ```
6. Find my IP address: ```/sbin/ip route get 8.8.8.8 | awk '{print $NF;exit}' ```
7. Run my jupyter: ```jupyter notebook --no-browser --ip=number_from_6```
8. What I see looks like a regular notebook on my own computer, but what I run is working on the partition "caslake" on the RCC
9. I work on my code, then close my notebooks as normal when I finish 
10. `conda deactivate` when I am done
11. `exit` to end my interactive session (if I didn't get kicked out 😔)
12. `exit` to end my session in midway

#### submitting jobs

0. Open terminal
1. ssh into midway3: `ssh uchicagoid@midway3.rcc.uchicago.edu`
2. Enter password and validate -- you need special permission 
3. Navigate to my working directory that has my .py and .sh ready with SLURM instructions
4. Submit my code with ``` sbatch my_instructions.sh ```
5. Wait for it to actually run! Let it run and come back.
6. The computer works while you are away.
7. `exit` and come back when you think the job is over.


# 2. Environments

Whenever we work on a project we need specific programs and modules. Versions are always changing, so code may not work if it isn't run with the same packages as when it was written. This is why it is important to create an environment and share it with your code.


[Why create a conda env? -Anaconda](https://www.anaconda.com/docs/tools/working-with-conda/environments)

[Managing env - Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html)
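
A quick way to see which versions you actually have, for example to note them next to your results, is the standard library's `importlib.metadata`. This is only a sketch: `installed_versions` is a made-up helper name, and the package list in the demo is a guess at what you might care about.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is not installed."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    course_packages = ["numpy", "pandas", "matplotlib", "seaborn"]
    for name, installed in installed_versions(course_packages).items():
        print(f"{name}: {installed or 'not installed'}")
```

This reports what is installed in whichever environment is running the code, so activate the right env first.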

## the skeleton

for this class we made [py101]


1) on your terminal
```bash
mamba create -n py101 python=3.12 -c conda-forge -y
```

2) Activate it
```bash
mamba activate py101
```

3) Install course packages
```bash
mamba install -c conda-forge numpy pandas matplotlib seaborn jupyterlab notebook ipykernel cartopy geopandas -y
```

4) Register the kernel for Jupyter
```bash
python -m ipykernel install --user --name py101 --display-name "Python (py101)"
```

5) Launch Jupyter - this contains both jupyter notebook and jupyter lab 
```bash
jupyter lab
```

6) Always `mamba deactivate` when you finish

This is the method you need for all env in the future! 

## to update

- To install new packages, simply repeat step 3 after activating your env!
- To do more complicated things, like changing the version of a package, follow the steps in the managing env link above!

# 3. Github [Version control]

There will be times you change something in your code and it breaks all your work. But when that happens I know you will have an older version you can refer to because you will be using github!

For this class I was the only person updating our notebooks, but in real life many people can work on the same code. I have never collaborated like this so I am not a pro. However, I do use github all the time to save my code. I have a calendar reminder so that every Friday I update my repos. 

the [2022 advanced course](https://github.com/NRT-DSEER/computing-for-research-2022/tree/main/notebooks) has a really nice explanation of github on day7, take a look!

[github](https://docs.github.com/en/get-started/start-your-journey/hello-world) also has a nice tutorial!

# 4. Skills Test

## Check-in [3] work on the quiz notebook

# 5. Review

## Where we’ve been

**Day 1 — Data & Storage**
- Files, folders, relative paths; read/write CSVs safely 

**Day 2 — Loops & Conditionals**
- `for/while`, `if/elif/else`, list comprehensions for concise transforms.

**Day 3 — NumPy**
- Arrays, vectorization, boolean masks; fast numeric ops over Python loops.

**Day 4 — Functions**
- Write reusable functions + docstrings; simple assertions for sanity checks.

**Day 5 — Intro Plotting**
- Matplotlib basics; labels, titles, grids, saving figures.

**Day 6 — Pandas Intro**
- reading and parsing files

**Day 7 — Pandas Modifications**
- chained ops, reshape (wide↔long), merges/joins.

**Day 8 — Pandas Math**
- Aggregations, rolling windows, resampling; small pipelines that end in a plot.

**Day 9 — Seaborn · SciPy · Maps · GIFs**
- Seaborn quick viz; `scipy.stats` (e.g., `linregress`, KS), GeoPandas maps (CRS!), simple `FuncAnimation` → GIF.

**Day 10 — Beyond Jupyter**
- Scripts (`.py`), modules for reusable logic, job wrappers (`.sh`) + HPC basics (SLURM), reproducible outputs.

---

## Victory lap — what you can do now (at a glance)

- Load & clean data → explore in a notebook → confirm with a small stat → plot clearly → save reproducibly.
- Turn a notebook idea into a **script** + **module**, and (optionally) run it on **HPC**.
