# Lesson 1: Introduction to Computing <a name='home' />

Welcome to this Python workshop. Dr. Mel Yang has been running this Python Bootcamp for years, and now Dr. Steph Spera is bringing some geography to the mix. Over the semester, you will learn the basics of Python, with examples that include biology (mostly genetics) and geography (mostly physical geography) content. This course is going to be presented in Jupyter Notebook, which we'll talk about later!

Many of these lessons are directly taken or adapted from a Python bootcamp at UC Berkeley that Dr. Yang helped to teach from 2012-2016. She also taught these lessons in the summer of 2017 at the Institute of Vertebrate Paleontology and Paleoanthropology at the Chinese Academy of Sciences and has been tinkering with them throughout her time at UR with her students. Dr. Spera is bringing some of her own working knowledge, and some lessons based on resources from CU-Boulder and the University of Wisconsin. We ask that the biology students be open to learning some geography, geography students be open to learning some biology, and everyone just embrace learning and flexibility.

## Lesson 1 Topics:
- <a href=#bookmark1>1. Expectations for the course</a> 
- <a href=#bookmark2>2. Navigating Linux</a> 
- <a href=#bookmark3>3. Jupyter Notebooks</a> 
- <a href=#bookmark4>4. Writing in Python</a> 
- <a href=#bookmark5>5. Comprehension Checks</a> 
- <a href=#bookmark6>6. Appendix 1: yourfirstname_jupyter.sh: Format of Slurm Script to run Jupyter Notebook</a> 
- <a href=#bookmark7>7. Appendix 2: Common Errors Running Jupyter Notebook through Slurm</a> 


## 1. Expectations for working through these lessons <a name='bookmark1' />

The idea behind these lessons is to create an environment for you to intensively work on developing your Python programming skills and give you a crash course in the basics of the language. By the end of the course, you will not be an expert on Python, but you should be able to write simple scripts and have the groundwork and general vocabulary to effectively use the internet and/or other resources to teach yourself fancier tricks. Most importantly, if you have a science question, you should start to envision ways you can write code that will help you arrive at an answer.

Keep in mind - learning to code is like learning a new language. It takes lots of practice, and it continuously feels hard and unintuitive until suddenly, it starts sticking. Just keep trying and **asking questions**--it will get easier and easier!

Some potentially useful python references include:
- [Linux cheat sheet](https://cheatography.com/davechild/cheat-sheets/linux-command-line/)
- [Style guide for Python](http://www.python.org/dev/peps/pep-0008/)
- [Learn Python](https://www.learnpython.org/) 
- [Python Website (documentation)](https://docs.python.org/3/index.html)
- [Python Code Visualization](http://www.pythontutor.com/visualize.html#mode=edit)

Click through some of these links, perhaps bookmark them - note what each of them is useful for, so you can use them as needed throughout the class. For instance, the Python Code Visualization will likely be useful as you start learning to use for loops in Python - it helps walk you through the logic of how a computer goes through the code. Learn Python is another resource for learning Python - if you don't understand something here, finding a reading on that topic there may help.

A half-credit course should take about 3-5 hours of your time each week, including the hour we meet. We expect you to come to the lab every Friday having done the Lesson or the Exercises and be ready with questions.

<a href=#home>Return to Top</a> 

## 2. Navigating Linux <a name='bookmark2' />

Before we touch any Python, we have to go over Linux. Linux is an open-source operating system (similar to Windows or macOS). It manages hardware resources, facilitates communication between software and hardware, and provides a user interface for interactions. It is WIDELY used, particularly when servers and clusters are involved.

The shell is the way you interact with your operating system. Most of you are used to the graphical user interface (GUI) you see when you turn on a computer - clicking the Start button allows you to access a folder, which contains files that you double click to open within the relevant default application. However, there is a command line user interface (CLI) that is very useful as well. Both Linux and Mac operating systems use the CLI **Terminal**, in which we use the Linux language for navigation. In the Windows operating system, this CLI is the **Command Prompt**, which uses a different language. However, Windows 10 and higher now have a Terminal software installed that uses Linux. If you have an older version of Windows, installing another CLI like cygwin or MobaXTerm will allow you to use UNIX commands on a Windows computer - let us know if you need this.

The next vital piece in writing python code is a text editor. Python code is usually written to a text file ending in ".py", and then you can use the Linux commands to access and run the python script from your shell (Terminal). Sometimes, you may also want to use an interpreter, or a program that allows you to directly execute commands without calling on a script you save to file. Here, we will gain familiarity with the Linux language, and below, learn how to use Jupyter Notebooks. 

During this semester, we will work from UR's remote cluster, *Spydur*. It is ideal because you should not need any new software (unless you have an old version of Windows, which we can deal with), and all you do need is an internet connection, and if you're off campus, UR's VPN. 

To get started, we want to make sure you are a user on the Spydur cluster and can access the directory `/scratch/myang_shared/lab/`. In the next section, you will see if you can log into Spydur and access the directory. 

### 2.1 Logging into Spydur

To log into Spydur, we will use the Linux command `ssh`.

In either Terminal (on Macs) or using 'Windows Powershell' (on Windows computers), type the following to log into Spydur (`ssh`), move into the class folder (`cd`), and make a directory for yourself within which you make another directory for the Python Bootcamp (`mkdir`). After each command (e.g. `ssh username@spydur`), hit enter to run it. 

You should have been told your username, but if you aren't sure, please let us know.

```bash
ssh username@spydur
cd /scratch/myang_shared/lab/PythonBootcamp/Sp24/students/
mkdir yourfirstname
cd yourfirstname
pwd
```

`pwd` stands for 'Print Working Directory', which is a handy command because it is very easy to get lost in directories. If you did this correctly, the screen should have printed, `/scratch/myang_shared/lab/PythonBootcamp/Sp24/students/yourfirstname`. If the commands above did not work, please contact your instructor (Yang for bio kids, Spera for geog kids).


### 2.2 Useful Linux commands

Sometimes it is hard to tell if you're actually logged into a cluster (or server) or not. The typical way to tell is to look at the bottom text in the Terminal -  it is of the form: `< username >@< whatever computer >`. If you're on Spydur, it should say `yourusername@spydur`. If you're not on Spydur, you might be on the Terminal on your personal computer - it's always good to check!

Type `pwd`. From `pwd`, you get a file path indicating the set of folders leading to your current directory. 
Type `cd ~/` then type `pwd`.
If it says `/home/yourusername/`, that means you're in your home directory, which is private to you. Note here that `cd` means change directory, and `~/` indicates your home directory. 

For the Python Bootcamp, we're working from a shared folder (`/scratch/myang_shared/lab/PythonBootcamp/Sp24/students/yourfirstname`), so any of us could hop over and take a look at what someone else did. 

If you were navigating as you normally would on your computer, you would have to click on each of these folders, from left to right, to end up in the folder the Terminal is waiting in (your 'working directory'). In the terminal window, you can type commands to do the same thing your mouse would do when you left or right click on the folder. Below are some of the major commands. 

* **pwd**   = print working directory 
* **ls**    = list all files/folders in the directory
* **cd**    = change the working directory (type the filepath to the requested directory)
* **echo**  = print to screen whatever string follows the command
* **mkdir** = make a new directory
* **rmdir** = remove a directory (can only be done if the directory is empty)
* **cat**   = print all contents of a file to screen
* **rm**    = remove a file
* **cp**    = copy a file (to a new filename and/or directory)
* **mv**    = move a file (rename file or move to a different directory)
* **head**  = print the first few lines of file to Terminal (default = 10)
* **tail**  = print the last few lines of file to Terminal (default = 10)
* **grep**  = search for an expression and print line(s) containing that expression
* **cut**   = print certain columns of the file (based on delimiter or by character)
* **less**  = opens a file within the Terminal screen as a text file (use q to exit)
* **man**   = gives manual for UNIX command you include after 'man'


Logged into Spydur, try the following:  

```bash
cd /scratch/myang_shared/lab/PythonBootcamp/Sp24/students/yourfirstname  ## Change into your folder in our shared directory on Spydur
pwd                   ## Where are you now in your computer directory?
mkdir Lesson1/
cd Lesson1/           ##Check and make sure you are in your Lesson1/ directory. 

# In your Lesson 1 Directory, 
echo hello
echo hello > hello.txt     ## '>' indicates rather than print to screen, print to the specified file. 
cat hello.txt         
ls -lrth                  ##ls is all you need, but '-lrth' orders them to show your most recent file, with additional info
echo "how are you" > hello.txt 
cat hello.txt              ## What happened?
echo "I am excited to learn Linux" >> hello.txt
cat hello.txt             #how is > different from >>

mkdir testdir
ls -lrth testdir
echo "Let's fill the test directory" > testdir/testfile.txt  ##Note above that I made a new file but without being in the new folder testdir/. This is because I indicated I wanted to make the file in the testdir/ folder. 
ls -lrth testdir/

cat testdir/testfile.txt
rm testdir/testfile.txt   ##You will likely get a prompt asking if you want to do this - Spydur checks in, your computer probably won't!
rmdir testdir/     ##This will work if there are no files inside
ls

cp hello.txt hello_copy.txt
mkdir practice/
mv hello.txt practice/
cat hello.txt ##Note now that you will get an error, because you moved hello.txt into the practice/ subdirectory, so it is no longer directly inside Lesson1/.
cat practice/hello.txt
cp hello_copy.txt practice/hello_copy2.txt
ls -lrth practice/
```

#### Important: Be very careful with rm and rmdir and the '>' sign. You can easily delete an important document, and these deleted documents are not saved in your Trash or Recycle Bin! 
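
If you would like a safety net while you practice, here is a minimal sketch of one option. It assumes you are using the bash shell (the default on most Linux clusters); the folder and file names (`sandbox`, `notes.txt`) are made up for illustration. The bash setting `noclobber` makes `>` refuse to overwrite a file that already exists, while `>>` still appends.

```bash
# Work in a throwaway folder so nothing important is at risk
# (mktemp -d creates a fresh temporary directory and prints its path)
sandbox=$(mktemp -d)
echo "first draft" > "$sandbox/notes.txt"

# With noclobber switched on, '>' will not overwrite an existing file
set -o noclobber
echo "second draft" > "$sandbox/notes.txt" || echo "overwrite blocked"

# '>>' still works, because appending never destroys existing text
echo "added line" >> "$sandbox/notes.txt"
contents=$(cat "$sandbox/notes.txt")
echo "$contents"

# Switch the safety off again and tidy up
set +o noclobber
rm -r "$sandbox"
```

Similarly, `rm -i filename` asks for confirmation before deleting, which is the same kind of prompt Spydur gives you.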



### 2.3 Reading through files

In `/scratch/myang_shared/lab/PythonBootcamp/Sp24/lessons/` there is a file called `SGDP.txt`. Using Linux commands, make a directory called `resources/` inside your own directory (`/scratch/myang_shared/lab/PythonBootcamp/Sp24/students/yourfirstname/`). Copy this file into your own `resources/` directory (`/scratch/myang_shared/lab/PythonBootcamp/Sp24/students/yourfirstname/resources/`). Rename this file `SGDPinfo.txt`.

Earlier, we used `cat` to quickly look at the contents of our files. We can use `head` instead. `head` prints the beginning of a file - its default is to print the first 10 lines. If you write two filenames after `head`, you get the beginning of both files printed to screen, in the order you listed them.

```bash
cd /scratch/myang_shared/lab/PythonBootcamp/Sp24/students/yourfirstname/Lesson1/
head ../resources/SGDPinfo.txt hello_copy.txt    ##The ../ is shorthand for 'go up one directory'. Note that SGDPinfo.txt and hello_copy.txt are NOT in the same directory.
cd ../resources/
```

Try the above code, but use `cat` instead of `head`. A problem with looking at files through `cat` is that it is difficult to view large files printed to Terminal. `head` and `tail` allow you to look at the head or tail of the file, seeing a much smaller set of information on your screen.

```bash
head -n2 SGDPinfo.txt
tail -n2 SGDPinfo.txt
```

What do you think the `-n2` did? Each command has multiple options - typing `man head` allows you to see a manual indicating what the command `head` can do. 

There are times, though, when you might want to look at the whole file. Another command you can use is `less`, which opens the file within the Terminal, allowing you to look through the document one screen at a time. It opens from the top of the file, and you use arrow keys to scroll up and down. The space bar allows you to move a screen's worth of text, allowing faster scrolling. The default is word wrapping, but you can use '-S' to chop the long lines (thus you would need left and right arrow keys to view the long sentences). Typing /word will search for 'word' in the file and highlight wherever it appears. Type 'q' to exit.

```bash
less SGDPinfo.txt
```

We can also search through the content of these files using `grep`.

```bash
grep Dinka SGDPinfo.txt
```
*Note to geography kids, Dr. Yang studies human evolution which is super cool. You can find out more about the Dinka people [here](https://nalrc.indiana.edu/doc/brochures/dinka.pdf).*

**From the SGDPinfo.txt file, what information can you find on the Dinka? What other line from the file might be useful to help you parse what you are reading?**





Many of the commands listed above have sub-options that you may find useful. Here, we will illustrate a few commonly used ones and run through how to get more information on these options using `man`. Remember that above I said that `man` is a command that allows you to read the manual for the Linux command in question.

```bash
ls -lrth
man ls
```

Here, the '-l' option uses the long listing format. That is, more detail is provided. The '-l' option also includes the file sizes, but they are given in bytes, which can be very large numbers. Adding '-h' makes these human-readable: the sizes are shown in K (kilobytes), M (megabytes), G (gigabytes) and T (terabytes) for easy reading. 
'-t' sorts your files by time, from most recently edited to the oldest edited. Since a long listing scrolls past on the screen, the part we normally end up reading is the end of the output--thus, we reverse the order using '-r', printing from the oldest edited to the most recently edited.
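
To see the effect of '-t' and '-r' for yourself, here is a small sketch you can paste into the Terminal. The file names are made up, and it uses `touch -t` (which sets a file's modification time) only so that the three practice files have clearly different ages.

```bash
# Make three empty practice files in a throwaway folder
sandbox=$(mktemp -d)
cd "$sandbox"

# touch -t sets the modification time (format: YYYYMMDDhhmm)
touch -t 202301010900 oldest.txt
touch -t 202306010900 middle.txt
touch -t 202401010900 newest.txt

ls -t     # sorted by time, most recently edited first
ls -tr    # reversed, so the most recently edited file is printed last

# The last name printed by 'ls -tr' is the most recent file
last_listed=$(ls -tr | tail -n 1)
echo "$last_listed"

# Go back to where you were and tidy up
cd - > /dev/null
rm -r "$sandbox"
```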

`grep` has an option to determine the line number for a term you are searching - use `man` to research `grep`. 

**Can you figure out which line numbers are the ones for Dinka?**

Return to the `SGDPinfo.txt` file using `less`. Now type `Shift + G`. This will bring you to the end of the file. Note that there's lots of blank space here. It isn't a problem here, but sometimes it can be a problem, so let's try to get rid of it. 

Make a copy of your file, just in case you mess something up (e.g. `SGDPinfo_orig.txt`), and then use the internet to find out how to go about getting rid of the extra space. If you are stuck, you can try typing "remove blank lines text file linux" into Google.

Look around and try a few commands - see if they work. 

If you're having trouble, try this [link](https://serverfault.com/questions/252921/how-to-remove-empty-blank-lines-from-a-file-in-unix-including-spaces) and ask me for a clue - I got one of these to work. 

**What did you type to make the new file with no blank lines?**

One more command to explore - `cut`

Try the following:

```bash
cut -f2 SGDPinfo.txt
cut -c15 SGDPinfo.txt
```

**Using what you see and `man`, what is `cut` doing?**

### 2.4 Pipes
>(The pipe character `|` is the one printed above the backslash `\` on its key.)

Piping with | connects Linux commands, allowing the output of one command to "flow through the pipe" to another. This lets you chain programs together, such that each one only needs to worry about one step of the process (either generating, filtering, or modifying data), without knowing or caring where it came from or where it's going to.

```bash
env
```

The `env` command returns a list of environment variables.  We won't go over what they mean here, but rather we're using this to demonstrate that some commands might return too much text to usefully view in the terminal.

However, you can use the | character to direct the output to another program to show just a portion.  
For example, the head command to show just the first few lines:


```bash
env | head
```

Or, say you were just trying to find the value of the HOME variable.  You could use grep to isolate just this portion.

```bash
env | grep HOME
```
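
Pipes can be chained more than once. As a small sketch (the list of tree names is made up for illustration), the commands below pass six lines of text through three programs in a row: `sort` puts identical names next to each other, `uniq` collapses adjacent repeats into one line, and `wc -l` counts the lines that remain.

```bash
# Six made-up field observations, one tree name per line
printf 'maple\noak\nmaple\npine\noak\nmaple\n' | sort

# Add one more step at a time and watch the output change
printf 'maple\noak\nmaple\npine\noak\nmaple\n' | sort | uniq

# Capture the final count in a variable using $( )
n_species=$(printf 'maple\noak\nmaple\npine\noak\nmaple\n' | sort | uniq | wc -l)
echo "distinct tree names: $n_species"
```

Building a pipeline one step at a time like this is a handy way to check that each command does what you expect before adding the next.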

**Back to `SGDPinfo.txt`, can you find the lines for Dinka and only list the column with their SGDP_ID? (that is, combining `grep` and `cut` using `|`).**


### 2.5 Wildcards

The star functions as a "wild-card" character that matches any number of characters. From your `Lesson1/` directory:

```bash
ls -lrth
ls -lrth *txt
```

The star can go anywhere in a list of arguments you're supplying, even in the middle of words! There are [other wildcards you can use](https://en.wikibooks.org/wiki/A_Quick_Introduction_to_Unix/Wildcards) but * is the most common.
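
Here is a short sketch of the star in different positions. The file names are invented for this example, and everything happens in a throwaway folder so your `Lesson1/` directory stays untouched.

```bash
# Make a few empty practice files in a temporary folder
sandbox=$(mktemp -d)
cd "$sandbox"
touch sample_A.txt sample_B.txt sample_A.csv notes.md

ls *.txt       # every name ending in .txt
ls sample_*    # every name starting with sample_
ls s*A*        # the star also works in the middle of a name

# Count how many files each pattern matched
n_txt=$(ls *.txt | wc -l)
n_mid=$(ls s*A* | wc -l)
echo "$n_txt files end in .txt; $n_mid files match s*A*"

# Go back to where you were and tidy up
cd - > /dev/null
rm -r "$sandbox"
```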

### 2.6 Linux special characters
There are many special characters in Linux - this [link](https://www.oreilly.com/library/view/learning-the-bash/1565923472/ch01s09.html) has a pretty comprehensive table. Note that it is best to avoid using these characters except for their specified function, but if you need to include them in a command, e.g. `echo`, you can use single quotes and it should realize you are using them as regular characters.

```
echo "hi!" #This will give you an error. --> -bash: !": event not found
echo 'hi!' #This will work!
```
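
The same difference between quote styles shows up with the `$` character, which bash uses to look up variables. A minimal sketch (the variable names `literal`, `expanded`, and `escaped` are just for illustration):

```bash
# Double quotes: $HOME is replaced by the value of the HOME variable
echo "My home directory is $HOME"

# Single quotes: everything is printed exactly as typed
echo 'My home directory is $HOME'

# A backslash inside double quotes protects one special character
echo "This costs \$5"

# Saving the results in variables to compare them
literal='$HOME'
expanded="$HOME"
escaped="This costs \$5"
```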

### 2.7 Permissions

Unlike the computers you are used to, Linux doesn't automatically know what to do with files (e.g. It won't know to use Word to open a .docx document), and it doesn't even know whether a file is data or a program (and as we'll see with the programs we write, it might be different things at different times).

The first thing that controls a file is the file's permissions. You can control who can read, write, and execute (run as a program) each of your files. This command lists the permissions. In your Lesson1 directory, type:

```bash
ls -lrth
```

You should see a bunch of letters in the first column. 
* The first letter tells you whether it is a directory.
* The next set of letters tell you if a file is readable (r), writable (w), or executable (x).
* The 2nd-4th letters tell you what *your* permissions are.
* The 5th-7th tell you what your group's permissions are.
* And the last three tell you what everyone else's permissions are. 

Linux was designed to be a multi-user operating system, so even if you're the only one who uses the computer, it maintains the distinction for you, versus your group, versus everyone else.
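
To practice reading that first column, here is a small sketch that takes the ten permission characters apart using `cut -c`. The file `demo.txt` is made up, and the `chmod` line (a command explained a little further down) is only there so the example file starts with known permissions.

```bash
# Make a practice file with known permissions in a throwaway folder
sandbox=$(mktemp -d)
touch "$sandbox/demo.txt"
chmod u=rw,g=r,o=r "$sandbox/demo.txt"   # you: read+write, group: read, others: read

# Keep only the first 10 characters of the long listing
perms=$(ls -l "$sandbox/demo.txt" | cut -c1-10)
echo "$perms"

echo "$perms" | cut -c1      # file type: '-' is a regular file, 'd' is a directory
echo "$perms" | cut -c2-4    # your permissions
echo "$perms" | cut -c5-7    # your group's permissions
echo "$perms" | cut -c8-10   # everyone else's permissions

rm -r "$sandbox"
```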

You can see what groups you are in by typing: 
```
groups yourusername
```
If you're in Dr. Yang's lab and on Spydur, you should see 'yanglab' as one of your groups. If you can't access a file, it's likely because your group doesn't have access. 

Type `ls -lrth /scratch/myang_shared/` and look at the `data/` folder - you should see there that your group has read and executable permissions, but no writing permissions. This means you can read and copy data from this folder, and run scripts from this folder if they exist, but you cannot make a file to put in this folder, or directly edit a file in this folder. Dr. Yang does this to make sure the large datasets we download and include can't accidentally be overwritten (except by her!). 

In your `Lesson1/` folder, can you figure out what the command below does? Copy it into your Terminal.

```
echo "grep Dinka ../resources/SGDPinfo.txt | cut -f2" > testscript.sh
```

Let's see the permissions and try to execute the file - for a shell script (typically ending in .sh), you usually run it by putting `./` in front. 

```
ls -l testscript.sh
./testscript.sh
```

You should see something like the following:

```
(base) m1-bio-myang:Lesson1 myang$ ls -l testscript.sh 
-rw-r--r--  1 myang  staff  47 May 24 12:42 testscript.sh

(base) m1-bio-myang:Lesson1 myang$ ./testscript.sh
-bash: ./testscript.sh: Permission denied
```

This indicates that you, the user, don't have permission to run, or execute, this file, as shown through the `ls -l` command. But, we can change permissions using `chmod` of the format: `chmod [flags] [filename]`

Try the following:
```bash
chmod +x testscript.sh
ls -l testscript.sh
```

**Now, try running the `testscript.sh` file and see what happens. And just like that, you've made and executed your first Linux script!**

Along with running scripts, we usually want to designate whether folders are completely private, viewable only, viewable by a group or anyone on the cluster, or with full edit permissions by a group or anyone on the cluster. 

For your folders in our shared directory, we want everyone in our group to be able to look in your folders, but we want only you to have edit permissions, so no one else accidentally deletes a file in your folder as they're learning. To do this, do the following:
```bash
cd  /scratch/myang_shared/lab/PythonBootcamp/Sp24/students ## Make sure you're in the shared students/ directory
chmod  754  yourfirstname/ ##The numbers adjust the permission settings on your folder titled 'yourfirstname/'
```

To better understand '754', you can visit [this link](https://chmodcommand.com/chmod-2754/). Each digit sets the permissions for one set of users: the first is for you (7 = read, write, and execute), the second is for your group (5 = read and execute), and the third is for anyone else on the cluster (4 = read only).
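
Each digit is a sum: read counts as 4, write as 2, and execute as 1. The sketch below tries `754` on a throwaway folder (the name `myfolder` is made up) so you can see the resulting letters without touching your real directory.

```bash
# 7 = 4 + 2 + 1 (rwx), 5 = 4 + 1 (r-x), 4 = 4 (r--)
echo $((4 + 2 + 1))
echo $((4 + 1))

# Try it on a practice folder
sandbox=$(mktemp -d)
mkdir "$sandbox/myfolder"
chmod 754 "$sandbox/myfolder"

# 'ls -ld' lists the folder itself rather than what is inside it
perms=$(ls -ld "$sandbox/myfolder" | cut -c1-10)
echo "$perms"

rm -r "$sandbox"
```

The leading `d` in the output marks a directory; the nine letters after it are the three permission sets.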


### 2.8 Shortcuts!

The *very* last Linux thing Dr. Spera wanted to do was to set shortcuts, because she is a very lazy typer. But feel free to skip this part, or ask to do this together when we meet as a group.  

Whenever you log into Spydur, you log into your `home` directory, i.e. `/home/<yourusername>`. Only you have access to this directory, and if you type `cd ~/` from any other directory, it will get you back there. This means, though, that every time you want to access the PythonBootcamp directory, you have to sit and type `cd /scratch/myang_shared/lab/PythonBootcamp/Sp24/students/yourfirstname/`. So, let's create a shortcut that will allow us to eliminate some of that typing.

In your home directory, you have a 'secret'/'hidden' file called a [.bashrc file](https://www.digitalocean.com/community/tutorials/bashrc-file-in-linux). You can find more info at that link, but it basically is a hidden file that is executed every time you log into Spydur. To show hidden files, in your `home` directory, type `ls -a`.

Now type `cat ~/.bashrc`. The last line of the file should say something like:
```bash
# User specific aliases and functions
```

So, every time you log into Spydur, it runs the code in the .bashrc file. We can add a line that assigns a short name to the Linux command that you want, in this case, a shorthand for getting into the shared `PythonBootcamp` directory.

```bash
# We're going to use the vi text editor to edit our .bashrc file
vi ~/.bashrc
# press the i key to enter insert mode
# using the arrow keys, navigate to the end of the file
# after the '# User specific aliases and functions' line, hit enter
# type:
alias pygroup='cd /scratch/myang_shared/lab/PythonBootcamp/Sp24/'
# hit the Esc key to get out of insert mode
# type:
:wq   # writes your edits to the file and quits vi
```

If this worked, whenever you log into Spydur, you can type `pygroup` (or you can name it whatever else makes sense to you, but do NOT use *python* as a shortcut, since that would hide the real `python` command!) and it will take you automatically to the `/scratch/myang_shared/lab/PythonBootcamp/Sp24/` directory. Let's test it out.

Type `source ~/.bashrc` to have Spydur read the updated .bashrc without logging into Spydur from a new Terminal window. Then, try typing `pygroup`, followed by `pwd` - see if your working directory is the `PythonBootcamp/Sp24/` directory.
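
If you want to experiment with aliases before editing `.bashrc`, you can define one directly in the Terminal; it lasts only until you log out. This is a minimal sketch, and the alias name `demo_scratch` and its target `/tmp` are made up for illustration.

```bash
# Define a temporary alias (it disappears when you close the Terminal)
alias demo_scratch='cd /tmp'

# Typing 'alias' followed by the name shows what the shortcut stands for
defn=$(alias demo_scratch)
echo "$defn"

# Typing 'alias' on its own lists every alias currently defined
alias

# Remove the practice alias again
unalias demo_scratch
```

Checking with `alias yourname` first is a quick way to make sure a name is not already taken before you add it to `.bashrc`.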

You can find more info on editing on vi [on this website from maybe 1997 which is still relevant](https://www.cs.colostate.edu/helpdocs/vi.html)

<a href=#home>Return to Top</a> 

## 3. Jupyter Notebooks  <a name='bookmark3' />

You are currently reading information in a [Jupyter notebook](http://jupyter-notebook.readthedocs.io/en/latest/). A notebook is like a document, but it is specific to Jupyter. It's like a Word docx or a Google Doc. The .docx file extension is specific to Word, the Google Doc is specific to Google Drive, and the notebook (with the .ipynb file extension) is specific to Jupyter. You can open and read the files using other programs, but they work best in Jupyter, just like .docx files work best with Word and Google Docs work best in Google Drive. (*Geography kids, you will be excited to know that ArcGIS Pro also uses these [notebooks](https://pro.arcgis.com/en/pro-app/3.1/arcpy/get-started/pro-notebooks.htm)*) 

Jupyter notebooks are useful because you can take your own notes in them (like what you see right now), you can use them as an interpreter (i.e. run code), AND you can write files directly to folders and then go to the Terminal to run the script. Lots of options! Jupyter notebook can also be used as a shell (i.e. you can also run Linux commands), making it a powerful one stop shop for all the main tools you use to begin coding.

The first thing we have to do, then, is learn how to access these notebooks.



### 3.1 Accessing Jupyter Notebooks on Spydur

We created a script (found in `/scratch/myang_shared/`) for each student that will initiate a Jupyter Notebook on one of Dr. Yang's compute nodes. The following lets you find this script and submit the job that starts the Jupyter notebook.

On Spydur, use the following code:

```bash
cd /scratch/myang_shared/
ls -lrth *.sh
qsub yourfirstname_jupyter.sh
less yourfirstname_jupyter.sh
```

*For more info on what's in that `yourfirstname_jupyter.sh` file, see the appendix below.*

After running Jupyter Notebook using `qsub`, check that your job is running by typing `squeue`. 
This should show a row where you see 'yourname-jn', as that is the job name we set above. 
The job should be on partition yang2, which is associated with the node spdr60, and your unique job ID for the submission is on the far left.

To figure out the commands you need to open Jupyter Notebook, look at the recent files in the `/scratch/myang_shared/` directory, find the most recent output file from your Jupyter Notebook job submission (it should be named like `yourfirstname_jupyter.sh.o######`), and print out its contents. If there are several, use the `.o######` file with the largest number (i.e. the latest job).

```bash
ls -lrth
cat yourfirstname_jupyter.sh.o####
```

Follow the instructions written in the output file. 
The output file should tell you to open a brand new Terminal window **NOT connected to Spydur** and paste the SSH Tunnel (which should look something like `ssh -N -f -L ####:localhost:#### username@spydur`). 
Then open Google Chrome and paste localhost:#### to see your Spydur account from an internet browser.
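
To make the pieces of that tunnel command less mysterious, here is a sketch that only builds and prints the command rather than running it. The port number `8888` and the name `username` are placeholders; always use the exact command from your own output file.

```bash
port=8888        # placeholder port; yours is in your own .o#### output file
user=username    # placeholder for your Spydur username

# -N : do not run any command on Spydur, only forward the connection
# -f : send ssh to the background so the Terminal stays usable
# -L : forward a port on your computer to a port on the other side
tunnel_cmd="ssh -N -f -L ${port}:localhost:${port} ${user}@spydur"

echo "$tunnel_cmd"
echo "then open your browser at localhost:${port}"
```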

***IMPORTANT BELOW***
***This is where we were running into trouble in class today, 1/19/2024***
If it asks for a token:
```bash
ls -lrth
cat yourfirstname_jupyter.sh.e####
```
Use the arrow keys to go to the bottom of the file and copy and paste the url that begins with 'http://127.0.0.1:...' into your browser. That should work, and you should be able to navigate around the directories then through your browser.
The next time you log in, it should be okay if you just follow the steps above in the *jupyter.sh.o#####* file.

Back on the server, you can set up a password for Jupyter Notebook so no token is needed.
To save a JN password, on the cluster:
* Type `jupyter notebook --generate-config`
* Type `jupyter notebook password`
Set your password, and make it easy to remember. 
Hopefully it will either just remember you, or you can use your set password.

***Back to regularly scheduled programming***
Once you're on Jupyter Notebook, you can click through any directories contained within /scratch/myang_shared/, such as your PythonBootcamp folder. You can also click New --> Terminal (top left corner) to open a Terminal window in your browser.

What opens is a web-based user interface with the files in the directory you opened in terminal (or your home directory when opening through Anaconda Navigator). Going to the top right and clicking "New" followed by "Python 3" under 'Notebook' will open your first jupyter notebook (ends in .ipynb). 

If you're stuck here, please contact your instructor for help.

While we do not explore all of the functions of the notebook, we highlight some of the basics here so you can navigate and write in these notebooks easily. 

* Each of these squares where we can write text is called a cell. By default each cell is a Python interpreter. Going to "Cell"->"Cell Type"->"Markdown" turns it into a text displayer, allowing you to write just plain notes, as this cell currently does. We will not be teaching all the different options for Markdown, but [this page](http://nestacms.com/docs/creating-content/markdown-cheat-sheet) provides some of the formatting you might be interested in.

* The top bar includes many different options. While the notebook usually saves automatically from time to time, you have a 'save' option, followed by options to insert new cells, move cells around, and run the cells. Pressing Shift+Enter will also run the active cell. You can also switch the cell format in the dropdown box. 

* Magic commands: One useful feature of the notebook is [magic commands](https://ipython.org/ipython-doc/3/interactive/magics.html). We'll see magic commands in just a bit.

* Last but not least, the default is that the cell treats what you type as Python commands. For the final section of the day, we will start writing in Python!

Other things to note:
* The easiest way of accessing the notebook on Spydur is how it is described above. There are also ways to install Jupyter Notebook on your computer - installing [Anaconda](https://www.anaconda.com/download) is probably the easiest way to do this smoothly. 

* Note that on Spydur, you set a time limit in the Jupyter Notebook script you ran (remember `qsub yourfirstname_jupyter.sh`?). The default we put in is 12 hours, but you can use `vi` to go in and edit the time if you want. 

* If you want to cancel the notebook on Spydur, type `squeue`, you will see a list of running jobs, like what is pasted below:
```
64513     yang2 jupyter-    myang  R      23:53      1 spdr60
```
This says the job is in the partition 'yang2' (running on node spdr60), has run for ~24 minutes, and the job's ID # is 64513. 
You can type `scancel JOBIDNUMBER` to stop Jupyter Notebook from running. If you do this, but you want to restart it, you must go to `/scratch/myang_shared/` and restart the job using `qsub`. 

* ArcGIS Pro has its own embedded version of Jupyter Notebook.

* Always remember that on a local Terminal on your computer, you will need to put in `ssh -NfL localhost:####:localhost:#### yourusername@spydur` to connect the notebook to your computer's browser. 
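
As a way to practice reading `squeue` output with the commands from Section 2, here is a sketch that pulls the job ID and node name out of the sample row shown in the list above. It works on a copy of that one line stored in a variable, so it runs anywhere; `tr -s ' '` (not covered earlier) squeezes each run of spaces down to a single space so that `cut` can split on it. Real `squeue` rows may begin with spaces, which would shift the field numbers by one.

```bash
# One sample row, copied from the squeue output shown above
row="64513     yang2 jupyter-    myang  R      23:53      1 spdr60"

# Squeeze repeated spaces, then cut on single spaces
job_id=$(echo "$row" | tr -s ' ' | cut -d ' ' -f 1)
node=$(echo "$row" | tr -s ' ' | cut -d ' ' -f 8)

echo "job $job_id is running on $node"
echo "to stop it you would type: scancel $job_id"
```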


### 3.2 Magic Commands: %%writefile and %%bash

One cool thing mentioned above is magic commands, which take the form `%%command`. Open a new Python 3 Jupyter notebook, type one as the first line of a cell, and the cell will do as the magic command asks.

There are two you'll learn now, `%%writefile` and `%%bash`.

```
%%writefile /scratch/myang_shared/lab/PythonBootcamp/Sp24/students/steph/Lesson1/testscript1.sh

for i in 1 2 3; do
    echo ${i}
done
```


Adjust the path in the cell above to fit your computer and get to your `Lesson1` folder. Then to run this cell, click within the cell so your cursor is there and type < Shift > + < Enter >. 

If it runs without error, you should see:

```Writing /scratch/myang_shared/lab/PythonBootcamp/Sp24/students/steph/Lesson1/testscript1.sh```

This indicates that a new file called `testscript1.sh` was created in the given folder, containing the text from the cell.

We can see if this is true by going to the `Lesson1/` folder and seeing if the new shell script is there. Did you find it?

Now let's try to run the file.

Remember to change permissions through Terminal.
Then, type `./testscript1.sh` to run the file.
What happens?

Try editing the above file to make it do something else - perhaps add the number four to the output. Try to look up on google how to loop over many more numbers - perhaps 1-100. Tweak the above file (maybe make a new cell and give it a different name so you don't lose the current one) based on some suggested ideas (ask for help if you need it!) to produce 1-100. 

Loops are complicated, and here we looked at a Linux loop. In a future lesson, we'll dive into 'for loops', but using Python.

In [None]:
%%writefile /scratch/myang_shared/lab/PythonBootcamp/Sp24/students/steph/Lesson1/file1_writefromnotebook.txt

I am writing this file from within the notebook.

I can write whatever I want and it will be written to a text file in the specified folder. 

If I don't write a filepath, this text file will be written into the home directory of the notebook.

Note that the '~' indicates my HOME directory - it can replace /home/myang/, which is the home directory of my computer.


When you write code, you are writing it into a text file - above I show some written notes that I saved to `file1_writefromnotebook.txt`. 
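
For intuition, here is a rough plain-Python sketch of what `%%writefile` does: it takes some text and saves it to a file at a given path. This is only an illustration, not how the magic command is actually implemented. The file name `example_note.txt` is made up for the example, and a temporary folder is used so nothing in your `Lesson1` folder is touched.

```python
import tempfile
from pathlib import Path

# Hypothetical text to save; any string works
text = "I am writing this file from plain Python.\n"

# Use a throwaway folder so we do not clutter a real directory
with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "example_note.txt"   # made-up file name
    path.write_text(text)                      # save the text to the file
    contents = path.read_text()                # read it back to check
    print(contents)
```

The takeaway is the same as with `%%writefile`: a script or a note is just text stored in a file.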

Now let's look at `%%bash`. Run the following command. Remember, to run the following cell, click within the cell so your cursor is there and type < Shift > + < Enter >. 

In [None]:
%%bash
## This allows you to treat the cell as a Terminal screen, and you can use UNIX commands
for i in 1 2 3; do
    echo ${i}
done

Note it does exactly what the Terminal does! You can't navigate between folders here as easily, but for quick looks at files, I often use `%%bash` so I can see the file output right next to where I might be writing my script (and not have to switch back and forth as much). 

Note that the `##` allows me to write comments into these Code cells. Using a `#` to comment out code is typical across many coding languages.

It doesn't have to be as complicated as a for loop - below I check the folder this notebook is in, as well as what else is also in the folder. 

In [None]:
%%bash

pwd

ls

#### Other useful commands for reference

1. **gzip** - Used for zipping/unzipping files that end in .gz
2. **tar** - Used for bundling/unbundling archives that end in .tar
3. **find** - Search for files that match a pattern
4. **wget** - Download a file from the internet

<a href=#home>Return to Top</a> 

## 4. Writing in Python <a name='bookmark4' />
### 4.1 You're ready to begin.

This entire lesson, you've learned about the Terminal, written Linux commands, and gotten a brief intro to Jupyter notebooks. Now on to Python. We will begin with the simple command:


In [None]:
print ("hello world")


As you can see, when I typed
```python
print ("hello world")
```
and pressed Shift+Enter, the notebook processed what I wrote in the cell and output the result below the cell. Here, I used `print` to display the string "hello world". `print` is similar to `echo` in Linux - it tells the computer to print to the screen whatever comes after the command, inside the parentheses. We're using Python 3, which always requires the (). Python 2 didn't require this, so you may see answers online that drop the () - in Python 3, you'll get an error message if you forget to put the () there. 
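
As a small sketch of how `print` behaves in Python 3: it accepts several values separated by commas and puts a space between them when printing. The values used here are just examples.

```python
# One value
print("hello world")

# Several values separated by commas are printed with a space between them
print("hello", "world", 2024)

# Leaving off the parentheses, as in Python 2's   print "hello world"
# is a SyntaxError in Python 3
```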

However, you can also run these python scripts outside of the notebook. Below, we write 
```python
print ("hello world")
```
into a text file using the magic command `%%writefile` and then run it through the Terminal. Typically, Python scripts are saved with the extension ".py" to indicate that they are Python scripts.

Name the file `myfirstscript.py`.

In [None]:
%%writefile /scratch/myang_shared/lab/PythonBootcamp/Sp24/students/steph/Lesson1/myfirstscript.py
print ("hello world")


And then run it - remember to adjust all the paths to be set to your directory!


In [None]:
%%bash

python /scratch/myang_shared/lab/PythonBootcamp/Sp24/students/steph/Lesson1/myfirstscript.py
##You can also do this directly in Terminal


### 4.2 Variables

Computer programming is useful because it allows the programmer to tell the computer to perform operations that are too boring, tedious, or difficult for the programmer to do by hand. A useful computer program needs to be able to interact with the user, perform operations on changing sets of data, and make decisions about how to proceed based on conditions specific to each instance of its execution. To achieve these tasks, computer programs employ variables.

Variables in computer science are different from algebraic variables, just like algebraic variables are different from statistical variables or experimental variables. In Python, a variable is a datum with a human-readable name which is assigned a given value. Variables can be reassigned to different values as the logic of the program dictates (variables have variable values, hence variables). 

In Python, every piece of data, from numbers and text to vectors and functions, is called an **object**, and objects are stored in memory. Technically, the values of variables in Python are the memory addresses of these objects.  Variables point at (reference) data (objects). **It is often easier to think of variables as 'storing' values, though some situations will require understanding of the more technical memory-based definition.**
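
Here is a minimal sketch of the 'variables point at objects' idea. It uses a list (a type covered in a later lesson) because a list can be changed in place, which makes the effect visible. The names `a` and `b` are made up for illustration.

```python
a = [1, 2, 3]   # a references a list object in memory
b = a           # b now references the SAME object, not a copy

print(a is b)            # True: both names point at one object
print(id(a) == id(b))    # True: id() reports the object's identity

b.append(4)     # change the object through b ...
print(a)        # ... and the change is visible through a: [1, 2, 3, 4]
```

This is one of the situations where thinking of a variable as a 'box storing a value' would give the wrong prediction.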

Python programs use variables to store parameters taken in from the user, the execution environment, or the data your program is being called upon to process.

These variables are named whatever you like, so long as they follow a few basic but important rules:

1. Python variable names are case-sensitive, so Var and var are different variables.

2. Though variable names can contain letters, numbers and underscores ( _ ), they MUST start with a letter (a-z or A-Z) or an underscore; they cannot start with a number.

3. Variable names CANNOT contain spaces or special non-alphanumeric characters.

4. Variable names also cannot be any of the following words, which already have special meaning in Python 3:

    False    None     True     and      as         assert   async    await
    break    class    continue def      del        elif     else     except
    finally  for      from     global   if         import   in       is
    lambda   nonlocal not      or       pass       raise    return   try
    while    with     yield


For the most part, ipython will remind you that these words are off-limits by coloring these words in helpful ways when you type them.


Here are some invalid python variable names: **1sample, sampleA.1, class**

And here are some good alternatives: **sample_1, SampleA1, bootcamp_class**
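
If you are unsure whether a name is allowed, you can ask Python itself. The sketch below checks the example names above using the string method `isidentifier()` and the standard library's `keyword.iskeyword()`. The list names `candidates` and `valid_names` are made up for this illustration. Note that `isidentifier()` follows Python's full naming rules, which are slightly more permissive than the simplified rules listed above.

```python
import keyword

candidates = ["1sample", "sampleA.1", "class",
              "sample_1", "SampleA1", "bootcamp_class"]
valid_names = []

for name in candidates:
    # isidentifier() checks the characters used in the name;
    # iskeyword() checks whether the name is a reserved word
    ok = name.isidentifier() and not keyword.iskeyword(name)
    if ok:
        valid_names.append(name)
    print(name, "->", "valid" if ok else "invalid")
```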


Variables can reference (store) many different types of objects. Today we'll talk about three types of objects: integers, floating point (i.e. decimal) numbers, and strings.

Run the following example, through which we'll explore a few properties of variables:

In [None]:
# by the way, lines starting with the pound sign (#)
# are comments, which are ignored by the interpreter
 
s = 'hello world'
i = 42
f = 3.14159
print(s)
print('the variable s is type',type(s))
 
print(i)
print('the variable i is type',type(i))
 
print(f)
print('the variable f is type',type(f))



 In general, variables are assigned by typing the name you want to use, followed by a single equals sign, then the value you'd like to store. This is the same whether the variable you're assigning is an object of type str (a character string), int (whole number), float (non-integer real number), or any number of other fancier things you'll be using in the next two weeks.
 
 
While (as your program tells you with the handy `type()` function) **i** is currently an integer, that doesn't mean it cannot change. You can easily reassign i to be anything that takes your fancy, including the value of another variable. You would do this with a statement such as the following:

In [None]:
i = s
 
print(i)
print('the variable i is now type',type(i))


There are plenty of cases where this is exactly what you want to do, but bear in mind that once a variable is re-assigned to a new value, the old value is lost forever.

As an example, consider the case where (for some reason) you want to swap the values of two variables s and i. The first step might appear to be a line very much like the i = s statement above, but if you do this, the value of i is lost forever, meaning you can never assign it to s. This may seem like a rather abstract problem but you'll encounter similar situations more often than you might think.
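
One common way around this problem, sketched below, is to park one of the values in a third, temporary variable before overwriting it. The name `temp` is just a placeholder chosen for this example.

```python
s = 'hello world'
i = 42

temp = i    # save i's value before it is overwritten
i = s       # i now holds 'hello world'
s = temp    # s receives the saved value, 42

print(s)    # 42
print(i)    # hello world

# Python also allows the swap in a single line:
s, i = i, s
print(s)    # hello world
print(i)    # 42
```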

### 4.3 Numerical operations

Numerical values can be subjected to a wide variety of operations. While the full list is quite extensive (see [this link](http://docs.python.org/lib/typesnumeric.html) for the full workup), the most common operations should be familiar. And, there are also python packages (which we'll get into soon), like [numpy](https://numpy.org/), which help us with all sorts of fancy numerical operations. 


Note below that standard mathematical order of operations applies, but it's far easier ... and safer ... to explicitly order compound operations using parentheses.
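
A short sketch of why the parentheses matter. The variable names `a` through `d` are arbitrary.

```python
a = 2 + 3 * 4        # multiplication first: 2 + 12 = 14
b = (2 + 3) * 4      # parentheses first: 5 * 4 = 20
c = -3 ** 2          # exponent before the minus sign: -(3 ** 2) = -9
d = (-3) ** 2        # parentheses make the intent explicit: 9

print(a, b, c, d)
```

The third line is a classic surprise, and it is a good argument for writing the parentheses even when you think you know the rules.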

In [None]:
i = 42
f = 3.14159
 
# addition uses the plus sign (+)
# (named total, not sum, so Python's built-in sum() is not overwritten)
total = i + f
# subtraction uses the minus sign (-)
diff = i - f
# multiplication uses the asterisk (*)
prod = i * f
# division uses the slash (/)
quo = i / f
# and exponents use a double-asterisk (**)
# (named power, not pow, so Python's built-in pow() is not overwritten)
power = i ** f
 
print ('sum',total)
print ('diff',diff)
print ('prod',prod)
print ('quo',quo)
print ('pow',power)
 
x = 5
print ("x = ", x)
x = x + 1
print ("now x is one more than before = ", x)
x += 1
print ("now x is one more than before = ", x)


Before we end, here are a few more functions related to thinking about the type of the variable you are using. Perhaps you have an integer but you'd rather treat it as a floating number (with decimals), or turn it into a string. You can use coercion functions such as:
```python
int()
float()
str()
``` 
to turn an object into an integer, floating number, or string, respectively.

In [None]:
x_int = 5
y_flt = 10.2
z_str ='4'

x_str = str(x_int)
x_flt = float(x_int)

y_int = int(y_flt)
y_str = str(y_flt)

z_int = int(z_str)
z_flt = float(z_str)

print('x:', x_int, x_str, x_flt)
print('y:', y_int, y_str, y_flt)
print('z:', z_int, z_str, z_flt)
 
# What happens when you uncomment the code below?
#result = x_int+y_flt+z_int; print (result, type(result))  # What happens when you add floats and integers?
#result = x_int+y_int+z_int; print (result, type(result))  # What happens when you add all integers?
#result = x_int+y_flt+z_str; print (result, type(result))  # What happens when you add a string to numbers?
#result = x_str+y_str+z_str; print (result, type(result))  # What happens when you add all strings?
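
Two details about these conversion functions are worth knowing, sketched below with made-up variable names. First, `int()` drops the decimal part rather than rounding. Second, a string can only be converted if it looks like the target type. The `try`/`except` lines are only there so the example keeps running after the failed conversion; we'll cover them properly in a later lesson.

```python
truncated_pos = int(10.9)    # 10: int() drops the decimal part, it does not round
truncated_neg = int(-10.9)   # -10: the decimal part is dropped toward zero
rounded = round(10.9)        # 11: round() gives the nearest whole number
print(truncated_pos, truncated_neg, rounded)

# A string must look like the target type to be converted
print(float('4.5'))          # works: '4.5' looks like a decimal number
try:
    int('4.5')               # fails: '4.5' does not look like a whole number
except ValueError as err:
    print('could not convert:', err)
```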

<a href=#home>Return to Top</a> 

## 5. Comprehension check  <a name='bookmark5' />

Most weeks we'll have short comprehension check questions at the end of our lessons. Before class on Friday, try and do the following: 
1. In your Lesson 1 folder make a new directory called 'Test'
2. Make a copy of your `file1_writefromnotebook.txt` file and put it in that directory
3. Using Jupyter Notebook, write a new python file, called `ILTFY.py` in your Lesson 1 directory, that will print the words, "I love that for you." (To test it out, on Terminal, type `python ILTFY.py` to see if the command works!)


We will end Lesson 1 here. Come in on Friday ready with questions and you'll move on to Exercise 1 next week.

<a href=#home>Return to Top</a> 

## Appendix 1 - yourfirstname_jupyter.sh  <a name='bookmark6' />

The details of your `yourfirstname_jupyter.sh` file are below if they are of interest. You can type `cat yourfirstname_jupyter.sh` to see for yourself. What this file does is tell Spydur to use the partition **yang2** (one of Dr. Yang's personal compute nodes on Spydur), using 5 CPUs and 10 GB of memory. In this example, Dr. Yang's default port number is **8890**. For each of your Jupyter Notebook scripts, I set a default port number, which is bookmarked on Slack under `#python_bootcamp`. Unless your port is taken by someone else, this number will generally be what you use. Any Jupyter Notebook accessed on Spydur from a remote computer at the same time must use a unique port.

```bash
#!/bin/bash
#SBATCH --account=myang
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=5
#SBATCH --mem=10G
#SBATCH --time=12:05:00
#SBATCH --job-name=mel-jn
#SBATCH --partition=yang2

# get tunneling info
XDG_RUNTIME_DIR=""
node=$(hostname -s)
user=$(whoami)
cluster="spydur"
port=8890

# print tunneling instructions jupyter-log
echo -e "
Command to create ssh tunnel:
ssh -N -f -L ${port}:${node}:${port} ${user}@${cluster}

Use a Browser on your local machine to go to:
localhost:${port}  (prefix w/ https:// if using password)
"

# Run Jupyter
jupyter-notebook --no-browser --port=${port} --ip=${node}
```

<a href=#home>Return to Top</a> 

## Appendix 2 - Common Problems Running Jupyter  <a name='bookmark7' />

### A. 'Address already in use' (after running the `ssh -N -f -L` command)

After you submit the job to run your Jupyter Notebook on Spydur (using `qsub`), you next open a Terminal on your laptop to 'tunnel' to your Jupyter Notebook job using `ssh -N -f -L ####:spdr59:#### username@spydur`. However, sometimes, you accidentally run the tunneling command more than once. If you do, this may cause an error where the Terminal screen prints out something similar to the following: 

```
bind [127.0.0.1]:8890: Address already in use
channel_setup_fwd_listener_tcpip: cannot listen to port: 8890
Could not request local forwarding.
```

If this happens, you want to kill all current `ssh -N -f -L` processes on your laptop and redo the process. 

To do this, enter `ps aux | grep spydur` on the Terminal screen NOT logged into Spydur. You should see a list of all processes on your laptop that have the term 'spydur' in them (you will probably see more than one). Find the rows with the `ssh -N -f -L` command, and look for the number in the second column (usually about five digits). This number is the associated process ID. To kill these processes, type `kill ##### ##### #####`, where the numbers are the process IDs for all the rows containing `ssh -N -f -L`. 

Type `ps aux | grep spydur` to confirm those jobs are no longer present. If successful, go back and repeat `ssh -N -f -L ####:spdr59:#### username@spydur`, making sure to only do it once. If you don't get the same 'Address already in use' error, then you should be good to log into Spydur from your Browser!

### B. Browser not loading but everything else seemed to work

Every once in a while, you'll log in, `qsub` will work with no problems, `ssh -N -f -L` will work with no problems, but your browser just won't load Jupyter Notebook. First, on the Terminal signed into Spydur, type `squeue` and confirm there is a job running for your Jupyter Notebook. If not, but `qsub` worked, then it may be the case that Spydur is super busy, and your Jupyter Notebook job is still waiting to be run. If you do see your job running, however, take a look in the **yourfirstname_jupyter.sh.e####** file associated with your job. If the port number in the link does not match your default port number, it's likely that another Spydur user was using your port number, so your job switched to a new port number. If so, identify the new port number in the **yourfirstname_jupyter.sh.e####** file, and run `ssh -N -f -L ####:spdr59:#### username@spydur` with the new port number on your other Terminal screen. The next time you log in and run Jupyter Notebook, you should assume your port number is back to the default port number. But you can always check the new **yourfirstname_jupyter.sh.e####** file to confirm. 

Every once in a while, you may hit `qsub` too many times and end up running more than one Jupyter Notebook. This usually isn't a problem, as your initial notebook with the default port is still running. But you might have multiple **yourfirstname_jupyter.sh.e####** files, which may get confusing. If you see multiple Jupyter Notebook jobs running when you check `squeue`, you can always type `scancel #####`, where the number is the Job ID of the Jupyter Notebook job you want to stop. This will kill that Jupyter Notebook job running on Spydur. 

<a href=#home>Return to Top</a> 