## Follow along and execute the cells one at a time, filling in the blanks.

Each exercise shows the general form of a command and asks you to modify or replace part of it to answer the question.

## File/folder navigation

##### **1) In the code cell below,** replace the `___`  in the command `cd ___` to get to your home directory.

There may be more than one way to do this.

In [1]:
# complete the following command
cd ~
cd /home/haogg@colostate.edu

In [2]:
# run this to check that you're in the right place
pwd

/home/haogg@colostate.edu
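
As a rough illustration of the "more than one way" remark, the sketch below reaches the home directory three ways and records where each one lands. The variable names (`via_tilde`, `via_var`, `via_bare`) are made up for this example, and it assumes `HOME` is set, as it normally is in a login session.

```shell
# Three equivalent ways to reach the home directory.
cd ~        && via_tilde=$(pwd)   # tilde expands to the home directory
cd "$HOME"  && via_var=$(pwd)     # the HOME environment variable holds the same path
cd          && via_bare=$(pwd)    # cd with no argument also goes home

echo "$via_tilde"
echo "$via_var"
echo "$via_bare"
```

All three `echo` lines should print the same path.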


##### **2) Change to your projects directory.** It has the form: `cd /projects/your_username`. David's is `/projects/dcking@colostate.edu`.

In [3]:
cd /projects/$USER
cd /projects/haogg@colostate.edu

In [4]:
# run this to check that you're in the right place
pwd

/projects/haogg@colostate.edu


Rerun from **2)** if `pwd` doesn't show your projects directory. 

##### **3a)** Change to the course repo

Provided you are in your projects directory, the class repository directory should exist. It is called *CM515-course-2025*.

In [5]:
cd CM515-course-2025

In [6]:
pwd

/scratch/alpine/haogg@colostate.edu/CM515/CM515-course-2025


##### **3b)** Change directories to the module 07_Command_Line.

<mark>Hint:</mark> *there is an intervening directory!* 

In [None]:
# take a look at the files in your current directory
ls

In [7]:
# change to the intervening directory
cd modules
pwd

/scratch/alpine/haogg@colostate.edu/CM515/CM515-course-2025/modules


In [8]:
# change to the Command line module (07...)
cd 07_Command_Line
pwd

/scratch/alpine/haogg@colostate.edu/CM515/CM515-course-2025/modules/07_Command_Line
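
The two `cd` steps above can also be combined into one relative path. The sketch below shows this on a throwaway directory tree built with `mktemp -d`; the tree only mimics the repo layout and is an assumption made for illustration, not the real course repository.

```shell
# Build a temporary directory tree that mimics the repo layout (hypothetical copy).
demo=$(mktemp -d)
mkdir -p "$demo/CM515-course-2025/modules/07_Command_Line"

cd "$demo/CM515-course-2025"
cd modules/07_Command_Line     # descend two levels with one relative path
here=$(basename "$(pwd)")
echo "$here"

cd ../..                       # ".." is the parent directory; go back up two levels
back=$(basename "$(pwd)")
echo "$back"
```

Stepping one directory at a time, with `ls` in between, is still the safer approach when you do not yet know the layout.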


##### **3c)** Find the directory **animal-counts** and change to it

Use `cd`, `ls`, `pwd` in the empty cell below to find and navigate to "animal-counts". 

In [10]:
cd shell-lesson-data/exercise-data/animal-counts
ls
pwd

animals.csv
/scratch/alpine/haogg@colostate.edu/CM515/CM515-course-2025/modules/07_Command_Line/shell-lesson-data/exercise-data/animal-counts


Check that you're in the right directory using `pwd` and `ls`. There should be a file there called *animals.csv*.

<mark>Make sure you're in this directory before proceeding to the next section!</mark>

---

## Working with tabular files

You must be in *07_Command_Line/ ... /animal-counts* to proceed. *...* means intervening directories.

### File format

Tabular files have a regular, columnar format and are ideally *tidy*, meaning that every column can be mapped to a variable. In this section, we will explore a *csv* file. The name stands for **comma-separated values.**

In [11]:
# confirm that you have animals.csv
ls animals.csv

animals.csv


If you got a pink error that says *ls: cannot access 'animals.csv': No such file or directory*, go back to 3c in the previous section.

### **1) Use various commands** to get information about the file

To get more information about how to use a command, use `man command`. For example, to learn the usage for the word count program, do `man wc` in a terminal or code cell.

In [12]:
# get the first 10 lines of animals.csv. Hint: this command is the same as in R
head animals.csv

2012-11-05,deer,5
2012-11-05,rabbit,22
2012-11-05,raccoon,7
2012-11-06,rabbit,19
2012-11-06,deer,2
2012-11-06,fox,4
2012-11-07,rabbit,16
2012-11-07,bear,1


Why didn't we get 10 lines?

In [13]:
# get the number of lines by using an option/argument to the command 'wc'. Replace '__' with the argument.
wc -l animals.csv

8 animals.csv
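
A minimal sketch of how `head` and `wc -l` relate, using a hypothetical 3-line stand-in for *animals.csv* written to a temporary file. `head` prints at most 10 lines by default, so a shorter file is simply printed in full.

```shell
# Hypothetical 3-line file standing in for animals.csv.
tmp=$(mktemp)
printf '2012-11-05,deer,5\n2012-11-05,rabbit,22\n2012-11-06,fox,4\n' > "$tmp"

shown=$(head "$tmp" | wc -l)           # head asks for 10 lines, but the file has only 3
total=$(wc -l < "$tmp")                # reading from stdin keeps the filename out of the output
first_two=$(head -n 2 "$tmp" | wc -l)  # -n sets how many lines head prints

echo "head printed $shown lines; file has $total lines; head -n 2 printed $first_two"
rm "$tmp"
```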


### 2) Extracting columns

This section will demonstrate how you work with columnar files on the command line. This has similarities to R, but also differs in significant ways.

#### Extract the animal name.

We'll use `cut` to get the second column of a csv, but first we have to look at its parameters.

##### `cut` usage message

Here is some truncated output from running `cut --help`

```
Usage: cut OPTION... [FILE]...
Print selected parts of lines from each FILE to standard output.

With no FILE, or when FILE is -, read standard input.

Mandatory arguments to long options are mandatory for short options too.
  -b, --bytes=LIST        select only these bytes
  -c, --characters=LIST   select only these characters
  -d, --delimiter=DELIM   use DELIM instead of TAB for field delimiter
  -f, --fields=LIST       select only these fields;  also print any line
                            that contains no delimiter character, unless
                            the -s option is specified
  -n                      with -b: don't split multibyte characters
      --complement        complement the set of selected bytes, characters
                            or fields
  -s, --only-delimited    do not print lines not containing delimiters
      --output-delimiter=STRING  use STRING as the output delimiter
                            the default is to use the input delimiter
  -z, --zero-terminated    line delimiter is NUL, not newline
      --help     display this help and exit
      --version  output version information and exit
```

We need to change the delimiter (*DELIM*) to a comma.

`-d, --delimiter=DELIM   use DELIM instead of TAB for field delimiter`

&nbsp;

Let's use the long form of the parameters (double dashes) to make our work more explicitly readable.

&nbsp;

In [14]:
cut --delimiter=, --fields=2 animals.csv

deer
rabbit
raccoon
rabbit
deer
fox
rabbit
bear


&nbsp;

You can use the shorter form of each option for brevity, e.g. `-d` instead of `--delimiter=`.

&nbsp;

In [15]:
cut -d, -f2 animals.csv

deer
rabbit
raccoon
rabbit
deer
fox
rabbit
bear
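
To check that the long and short forms really are interchangeable, the sketch below runs both on a hypothetical 2-line sample and stores the results for comparison. It assumes GNU `cut` (the version whose `--help` output is shown above), since the long options may not exist in other implementations.

```shell
# Hypothetical 2-line sample in the same format as animals.csv.
tmp=$(mktemp)
printf '2012-11-05,deer,5\n2012-11-05,rabbit,22\n' > "$tmp"

long_form=$(cut --delimiter=, --fields=2 "$tmp")   # GNU long options
short_form=$(cut -d, -f2 "$tmp")                   # short options, value attached
spaced_form=$(cut -d ',' -f 2 "$tmp")              # short options, value as a separate quoted word

echo "$short_form"
rm "$tmp"
```

Quoting the delimiter, as in `-d ','`, is a good habit for characters the shell treats specially, such as `;` or `|`.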


##### What questions can we answer with pipes?

Using `sort` and `uniq`, we can see:
- how many different days were recorded, and
- how many different animals were observed.

In [16]:
cut -d, -f2 animals.csv | sort | uniq -c

      1 bear
      2 deer
      1 fox
      3 rabbit
      1 raccoon
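
The `sort` step in the pipeline matters because `uniq` only collapses duplicate lines that are next to each other. The sketch below illustrates this on the animal names from the output above, written to a temporary file; the variable names are invented for the example.

```shell
# The animal column, in the order it appears in the file.
tmp=$(mktemp)
printf 'deer\nrabbit\nraccoon\nrabbit\ndeer\nfox\nrabbit\nbear\n' > "$tmp"

unsorted_groups=$(uniq "$tmp" | wc -l)     # no duplicates are adjacent, so nothing collapses
distinct=$(sort "$tmp" | uniq | wc -l)     # sorting first makes duplicates adjacent
distinct_short=$(sort -u "$tmp" | wc -l)   # sort -u sorts and de-duplicates in one step

echo "without sort: $unsorted_groups lines; with sort: $distinct distinct animals"
rm "$tmp"
```

Ending the pipeline with `wc -l` answers "how many different animals" directly, while `uniq -c` shows the count for each one.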


&nbsp;

Now, complete the `fields` argument to count the number of observations taken on each day.

&nbsp;

In [17]:
cut --delimiter=, --fields=1  animals.csv | sort | uniq -c

      3 2012-11-05
      3 2012-11-06
      2 2012-11-07


##### Selecting more than one column

In [22]:
# date and count
cut -f1,3 -d, animals.csv

2012-11-05,5
2012-11-05,22
2012-11-05,7
2012-11-06,19
2012-11-06,2
2012-11-06,4
2012-11-07,16
2012-11-07,1


In [19]:
# There are two more pairwise combinations: date and animal, and animal and count. Specify them below:

# date and animal
cut --fields=1,2 --delimiter=, animals.csv

# animal and count
cut --fields=2,3 --delimiter=, animals.csv

2012-11-05,deer
2012-11-05,rabbit
2012-11-05,raccoon
2012-11-06,rabbit
2012-11-06,deer
2012-11-06,fox
2012-11-07,rabbit
2012-11-07,bear
deer,5
rabbit,22
raccoon,7
rabbit,19
deer,2
fox,4
rabbit,16
bear,1


**Things to note.**

- The order of the dash arguments doesn't matter.
- The output has the same delimiter as the input.
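
Both notes can be checked with a short sketch on a hypothetical one-line sample. It also shows two related behaviours of GNU `cut`: fields are printed in file order even if listed as `3,1`, and the `--output-delimiter` option from the help text above changes the separator in the output.

```shell
# Hypothetical one-line sample in the same format as animals.csv.
tmp=$(mktemp)
printf '2012-11-05,deer,5\n' > "$tmp"

a=$(cut -d, -f1,3 "$tmp")
b=$(cut -f3,1 -d, "$tmp")                                # options swapped AND fields listed as 3,1
spaced=$(cut -d, -f1,3 --output-delimiter=' ' "$tmp")    # GNU option: space-separated output

echo "$a"
echo "$b"
echo "$spaced"
rm "$tmp"
```

Because `cut` cannot reorder columns, a tool such as `awk` is the usual choice when the output order has to differ from the input order.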

**Pitfalls, dealing with errors**

What happens if you forget the input file, ***animals.csv***? <br> The command waits for data on standard input, so the kernel hangs *(in a Jupyter notebook you have no way to supply that input)*.

Get ready to **interrupt the kernel** by:
1. going to the menu at the top,
2. selecting *Kernel* between *Run* and *Tabs*, and then
3. selecting the first option, *Interrupt Kernel*.

The keyboard shortcut is to press *i* twice (*i, i*).

In [23]:
# animals.csv is omitted on purpose to show what happens when the command is waiting for input
# interrupt the kernel to get out of it
cut -d, -f1 | sort | uniq -c
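
The hang happens because `cut` with no FILE reads standard input, as its help text says. The sketch below shows that the same command finishes normally once standard input is connected to something, either a pipe or a redirect; the sample data and variable names are assumptions for illustration.

```shell
# Hypothetical 2-line sample in the same format as animals.csv.
tmp=$(mktemp)
printf '2012-11-05,deer,5\n2012-11-06,fox,4\n' > "$tmp"

from_pipe=$(cat "$tmp" | cut -d, -f1)    # input arrives through a pipe
from_redirect=$(cut -d, -f1 < "$tmp")    # input arrives through a redirect
from_empty=$(cut -d, -f1 < /dev/null)    # empty input: returns at once and prints nothing

echo "$from_pipe"
rm "$tmp"
```

This is also why `sort` and `uniq` work in the middle of a pipeline without a filename: each reads the previous command's output as its standard input.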




### 3) Find the mistake!

This section will give you some practice troubleshooting different aspects of code:
- fixing errors,
- correcting missing or incomplete syntax/commands, and
- troubleshooting arguments to commands.

In [27]:
cd /projects/$USER/CM515-course/modules/07_Command_Line

/scratch/alpine/haogg@colostate.edu/CM515/CM515-course-2025/modules/07_Command_Line


In [28]:
cd shell-lesson-data                         # Get to the data from module 07

In [33]:
head north-pacific-gyre/NENE0*Z.txt         # Get the first 10 lines of data files that end in Z.txt 

==> north-pacific-gyre/NENE01971Z.txt <==
0.0618278658331
7.58624853182
2.3584281401
2.59023630985
1.63700034981
0.369828482931
1.20260030361
0.116916208855
1.73621570441
0.0403725346637

==> north-pacific-gyre/NENE02040Z.txt <==
0.013313093923
0.317908185157
0.077246965355
1.08844951058
0.217470543005
0.11804743938
0.373768771019
0.888126185356
0.857964712204
1.2948587718
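
The `*` in `NENE0*Z.txt` is a wildcard that the shell expands into matching filenames before `head` runs, which is why `head` printed a `==> name <==` banner for each file. A minimal sketch using hypothetical empty files in a temporary directory (the names are borrowed from the cells in this section):

```shell
# Empty stand-ins for three of the data files (hypothetical copies).
demo=$(mktemp -d)
touch "$demo/NENE01729A.txt" "$demo/NENE01971Z.txt" "$demo/NENE02040Z.txt"

cd "$demo"
z_files=$(echo NENE0*Z.txt)     # only names ending in Z.txt match
all_files=$(echo NENE0*.txt)    # a looser pattern matches all three

echo "$z_files"
echo "$all_files"
```

Running `echo` with a pattern is a cheap way to preview what a wildcard will match before handing it to a command that changes files.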


In [34]:
head -n 2 north-pacific-gyre/NENE01729A.txt  # Get the first 2 lines of one file. Without a filename, head waits for input: use Kernel->Interrupt Kernel

1.03150932862
1.44755225695


In [35]:
touch newfile.txt                            # Make a new empty file

In [36]:
mkdir my_directory                       # Make a new empty directory

In [37]:
mv my_directory my_dir                  # Rename directory  

In [38]:
mv newfile.txt my_dir                     # Move newfile.txt to my_dir

In [39]:
mv my_dir/newfile.txt .                     # Move newfile.txt back to the current directory
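
The last three cells use `mv` in two different ways: it renames when the destination does not exist, and it moves into the destination when that is an existing directory. The sketch below replays the same steps in a temporary directory so nothing in the course repo is touched; the `moved_in` and `moved_back` variables are invented for checking the result.

```shell
# Work in a temporary directory (hypothetical sandbox).
demo=$(mktemp -d)
cd "$demo"

touch newfile.txt
mkdir my_directory
mv my_directory my_dir      # my_dir does not exist yet, so this is a rename
mv newfile.txt my_dir       # my_dir is an existing directory, so the file moves into it
[ -f my_dir/newfile.txt ] && moved_in=yes

mv my_dir/newfile.txt .     # "." means the current directory
[ -f newfile.txt ] && moved_back=yes

echo "moved in: $moved_in, moved back: $moved_back"
```

Because the outcome depends on whether the destination already exists, it helps to run `ls` before a `mv` whose target name you are not sure about.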