-
Notifications
You must be signed in to change notification settings - Fork 0
/
LinuxPres.Rpres
289 lines (199 loc) · 8.34 KB
/
LinuxPres.Rpres
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
LinuxPres
========================================================
author:
date:
```{r, include = FALSE}
library(knitr)
opts_chunk$set(eval = FALSE)
opts_chunk$set(engine = "bash")
```
What is Linux
========================================================
- Just another operating system.
- Created by Linus Torvalds ("Lin" as in Linus)
- based on Unix OS ("ux" as in Unix)
- Open source/free
- Widely used on servers
- Has graphical interface, but often used mainly from CLI
- This workshop will focus on CLI (Bash)
Why might I need Linux?
========================================================
- Interacting with computing cluster (e.g., AWS, GHPCC)
- Using software only available for Linux
- Economy (it's free!)
- Hacker street cred
Basics - file structure
========================================================
Log into your AWS instance, and try the command below.
```{r, eval=FALSE}
pwd
```
This stands for "Print Working Directory," it gives the name of the folder (directory in Linux jargon) you are currently in.
Basics - file structure
========================================================
What directories exist by default?
- The root directory, known only as /
- This is where everything on the computer is stored
- /tmp, for temporary files. The contents of this directory is deleted after a reboot!
- /usr, programs installed here, often in subdirectories.
All of these begin with "/", because they are subdirectories of root.
Unless you are installing something, you won't have to worry about these.
Basics - file structure
========================================================
Your **home folder** is where most of your work is done. In here, you can create files, edit and execute them without Linux complaining.
Just like **/** is the shorthand for root, **~** means your home directory.
Navigating
========================================================
You can move around between these directories using the **cd** (as in "change directory") command. Try this:
```{r, eval=FALSE}
cd /usr/bin
```
The argument for this command "/usr/bin" is called a **path**. Paths are the location directories or files on the computer. This means we are in **bin** inside **/usr**.
To see what's in here use this command:
```{r, eval=FALSE}
ls
```
You will see a list of all files and subdirectories in /usr/bin.
Go home
=====
To navigate back to your home folder, you can do
```{r}
cd ~
```
but just `cd` with nothing after it works too.
**~** is a special nickname for the home directory. Other special nicknames include
- **..** for the parent directory
- **.** for the current directory
- **-** for the previous directory
Section 2: file manipulation
===============
- Making files
- moving, renaming, removing
- permissions
wget
=========
The `wget` command pulls a file from a server (like on the internet)
```{r}
wget
```
The following URL links to a csv file with US Geological Survey parameter codes: http://bit.ly/2eA7j2p
To download it to your current working directory, use `wget http://bit.ly/2eA7j2p`
File permissions
=====
Files can be *read*, *written*, and/or *executed*. Not every user can do all these things to every file.
To see the permissions for a file, type `ls -l [file name]`.
```{r, engine = "bash"}
ls -l
```
The `-l` is an option given to the `ls` command, telling it to list files in *long* format, i.e. with more information. These options (flags) are associated with many different commands and options.
Changing file permissions
=======
The **chmod** function sets permissions for a specified file.
- There are 3 permissions (read, write, execute)
- encoded as 3-digit binary converted to integer, for example:
- (*read* and *write* and *execute*) = (111) = 7
- (*not read* and *write* and *not execute*) = (010) = 2
- There are 3 user categories that can have permissions: owner, group, all
- Smash the 3 permission integers together, for example:
- `chmod 777 [filename]` gives everyone read, write, and execute permissions
- `chmod 744 [filename]` gives the owner owner can read, write, and execute permissions, but everyone else can only read.
Making files--3 ways
=====
1. You can make an empty file using the `touch` command. For example, `touch foo.csv` would make an empty file with the name "foo.csv"
2. You can also create a file and fill it at the same time using the `>` operator `echo hi there > hi.txt` creates a file called `hi.txt` with the text "hi there"
- `>` is called a **redirect**. It redirects output of a function to a file.
3. Use a built-in command-line text editor. **vi** is a popular one, but it's not easy to use. **nano** is much easier. `nano foo.csv` allows you to edit the file "foo.csv".
Task
=====
- Create a file called changeprompt.sh
- change the permissions to make it executable
- Using `nano`, put the following text in the file:
```{r}
#! /bin/bash
export OLDPS=$PS1
export PS1="GRiD_"$OLDPS
```
- Execute the file by running `sh changeprompt.sh`
Your command prompt should now have "GRiD_" at the beginning.
- Change it back by running `export PS1=$OLDPS`
Other file operations
======
- **cp** copy a file: `cp [file name] [copy name]`
- **mv** move a file: `mv [file name] [copy name]`
- **rm** remove a file: ` rm [file name]`
- **mkdir** make a new directory: ` mkdir [directory name]`
Section 3: Probing files, folders, and contents
=======
- **find** to look for files with a particular name
- **grep** to look for text that matches a pattern
- **head** to print first few lines of a file
- **tail** same, for last few lines
- **cat** to print the entire file contents
The pipe operator
=======
You'll often need to chain commands together, "piping" the output of one into another as input. This is accomplished using the vertical bar symbol, **|**
- look for the string "needle" in the file "haystack.txt"
- `cat haystack.txt | grep needle`
- sort all occurrences of the string "needle" alphabetically by the first character in the line
- `cat haystack.txt | grep needle | sort`
Task
======
Find all the lines in the file **usgsCodes.csv** that contain the text "Trihalo".
Section 4: Packages and processes
========
Running a program
======
We can run Python from the command line, since it comes preinstalled with our AWS instance.
```{r}
python
```
should bring up a new and different command prompt. You are now in a python console and your bash commands won't work. But Python code will.
```{r}
print 'hello!'
3 + 4 * 6
```
`quit()` will return you to the bash terminal.
The `sudo` (super-user do) command
=======
Linux doesn't let just anybody play around with its guts. Only someone with root access, a "super user" can do things like install new packages or delete unwanted ones.
- Playing around with Linux guts is dangerous and can harm your machine.
- But you can run a command as a super user by prepending it with `sudo`.
```{r}
touch /this # doesn't work
sudo touch /this # works
```
Getting a new package using apt-get
========
Linux programs (*packages*) can be obtained using a package management tool. For Ubuntu (the linux distribution you're using), the package manager is called **apt**. The two most important commands are **apt-get install** and **apt-get update**.
Try obtaining R using `apt-get install r-base`. Does it work? How can you make it work?
Terminating/killing runaway programs
=====
- `top` to see top resource-using processes
- `ps -aux` to see all running processes
- pipe into `grep` to get process ID of the process you want to kill
- `kill [process ID]` will kill the process with a specified ID
Process termination exercise
=====
1. `wget` the following file
2. `mv` it to have the name "blabber.sh"
3. `chmod` permissions so you can execute it.
4. Run it using `./blabber.sh &` (note the ampersand)
Process termination exercise
=====
You should now have a file called "blah.txt". Monitor its contents with `tail -f blah.txt`.
- Your runaway process, blabber.sh, is writing text in an infinite loop!
Learning more
=====
Function documentation
- Manual pages, `man [function]`
- often verbose
- help flag, often `-h` or `--help`
- e.g. `python -h`
- Google, stackexchange
Other functions and things to learn about on your own
====
|function|what it does|
|-------|----------|
| sed | text manipulation |
| regex | computer language for matching strings of text |
| rsync | intelligent file transfer/syncing|