# File Management {#command-line-file-management}
In this chapter, we will explore commands for file management including:
- creating new files/changing timestamps
- copying files
- renaming/moving files
- deleting files
- comparing files
```{r table_file_manage, echo=FALSE}
cname <- c("`touch`", "`cp`", "`mv`", "`rm`", "`diff`")
descrip <- c("Create empty file(s)/change timestamp",
"Copy files & folders",
"Rename/move file",
"Remove/delete file",
"Compare files")
data.frame(Command = cname, Description = descrip) %>%
kable() %>%
kable_styling(
bootstrap_options = c("striped", "hover", "condensed", "responsive")
)
```
## Create new file
`touch` modifies file timestamps, the metadata that records when a file was last accessed or changed. A file carries the following timestamps:
- access time (the last time the file was read)
- modification time (the last time the contents of the file were changed)
- change time (the last time the file's metadata was changed)
If the file does not exist, `touch` creates an empty file with that name. Let us use `touch` to create a new file `myanalysis.R`.
```{bash c17a, eval=FALSE}
touch myanalysis.R
ls
```
```{bash c17b, echo=FALSE}
cd cline
touch myanalysis.R
ls
```
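The timestamp behaviour described above can be sketched as follows. This is a minimal illustration that runs in a scratch directory; the file names `notes.txt` and `reference.txt` are made up for the example, and `-t` is the standard `touch` option for setting an explicit time.

```bash
# Work in a throwaway directory so no real files are affected
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Create two empty files (hypothetical names) with explicit, older timestamps
# (-t takes a time in the form [[CC]YY]MMDDhhmm)
touch -t 202001010000 notes.txt
touch -t 202101010000 reference.txt

# notes.txt is currently older than reference.txt
[ notes.txt -ot reference.txt ] && echo "notes.txt is older"

# touch on an existing file leaves the contents alone and sets the
# access and modification times to the current time
touch notes.txt
[ notes.txt -nt reference.txt ] && echo "notes.txt is now newer"
```

The `-ot` and `-nt` tests compare modification times, which makes them a convenient way to check the effect of `touch` without reading `ls -l` output.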
## Copy Files/Folders
`cp` makes copies of files and directories. The general form of the command is
**cp source destination**. By default, it will overwrite files without prompting for confirmation so be cautious while copying files or folders.
### Copy files in same folder
Let us create a copy of `release_names.txt` file and name it as `release_names_2.txt`.
```{bash c19a, eval=FALSE}
cp release_names.txt release_names_2.txt
ls
```
```{bash c19b, echo=FALSE}
cd cline
cp release_names.txt release_names_2.txt
ls
```
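Since `cp` replaces an existing destination without asking, it may help to see the overwrite happen next to one possible guard. The sketch below runs in a scratch directory with made-up file names (`draft.txt`, `final.txt`); the existence check is only one way to protect a file and is not taken from this chapter.

```bash
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Hypothetical files for illustration
echo "new text" > draft.txt
echo "important text" > final.txt

# Guarded copy: only copy when the destination does not exist yet
if [ -e final.txt ]; then
  echo "final.txt already exists, not copying"
else
  cp draft.txt final.txt
fi
cat final.txt    # still prints: important text

# Plain cp overwrites final.txt without asking
cp draft.txt final.txt
cat final.txt    # now prints: new text

# For interactive use, `cp -i draft.txt final.txt` prompts before overwriting
```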
### Copy files into different folder
To copy a file into a different directory/folder, we need to specify the name of the destination folder. If the copied file should have a different name, then we need to specify the new name of the file as well. Let us copy the `release_names.txt` file into the `r_releases` folder (we will retain the same name for the file as we are copying it into a different folder).
```{bash c21a, eval=FALSE}
cp release_names.txt r_releases/release_names.txt
```
```{bash c117a, echo=FALSE}
cd cline
mkdir r_releases
cp release_names.txt release_names_3.txt
```
```{bash c21b, echo=FALSE}
cd cline
cp release_names.txt r_releases/release_names.txt
```
Let us check if the file has been copied by listing the files in the `r_releases` folder using `ls`.
```{bash c20a, eval=FALSE}
ls r_releases
```
```{bash c20b, echo=FALSE}
cd cline
ls r_releases
```
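The two cases described above, keeping the file name and choosing a new one, can be sketched side by side. The scratch directory, the `archive` folder and the file names are assumptions made for this example.

```bash
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Hypothetical file and folder for illustration
echo "sample" > notes.txt
mkdir archive

# Destination is a folder: the copy keeps the original file name
cp notes.txt archive/

# Destination includes a file name: the copy is stored under the new name
cp notes.txt archive/notes_backup.txt

ls archive
```

Note that the destination folder has to exist before copying; `cp` does not create it.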
### Copy folders
How about making copies of folders? Use the `-r` option, which stands for `--recursive`, to copy a folder together with everything inside it. Let us create a copy of the `r` folder and name it `r2`.
```{bash c22a, eval=FALSE}
cp -r r r2
ls
```
```{bash c22b, echo=FALSE}
cd cline
cp -r r r2
ls
```
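One detail worth knowing about recursive copies is that the result depends on whether the destination folder already exists. The following sketch uses a scratch directory and hypothetical folder names (`scripts`, `scripts_copy`) to show both cases.

```bash
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Hypothetical folder with a nested sub-folder and one file
mkdir -p scripts/helpers
echo "x <- 1" > scripts/helpers/setup.R

# scripts_copy does not exist yet: it becomes a copy of scripts
cp -r scripts scripts_copy
ls scripts_copy            # helpers

# scripts_copy exists now: scripts is copied *into* it
cp -r scripts scripts_copy
ls scripts_copy            # helpers and scripts
```

Running the same command twice therefore nests a second copy inside the first, which is easy to miss when re-running a script.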
## Move/Rename Files
`mv` moves and renames files and directories. Using different options, we can ensure that
- files are not overwritten
- the user is prompted for confirmation before files are overwritten
- details of the files being moved are displayed
```{r table_mv, echo=FALSE}
cname <- c("`mv`", "`mv -f`", "`mv -i`", "`mv -n`", "`mv -v`")
descrip <- c("Move or rename files/directories",
"Do not prompt for confirmation before overwriting files",
"Prompt for confirmation before overwriting files",
"Do not overwrite existing files",
"Move files in verbose mode")
data.frame(Command = cname, Description = descrip) %>%
kable() %>%
kable_styling(
bootstrap_options = c("striped", "hover", "condensed", "responsive")
)
```
Let us move the `release_names_2.txt` file to the `r_releases` folder.
```{bash c23a, eval=FALSE}
mv release_names_2.txt r_releases
```
```{bash c23b, echo=FALSE}
cd cline
mv release_names_2.txt r_releases
```
Use `ls` to verify that the file has been moved. As you can see, `release_names_2.txt` is no longer present in the current working directory.
```{bash c23c, eval=FALSE}
ls
```
```{bash c23d, echo=FALSE}
cd cline
ls
```
Let us check if `release_names_2.txt` is present in the `r_releases` folder. Great! We have successfully moved the file into a different folder.
```{bash c23e, eval=FALSE}
ls r_releases
```
```{bash c23f, echo=FALSE}
cd cline
ls r_releases
```
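Whether `mv` renames or moves depends on the destination. A minimal sketch, run in a scratch directory with made-up names (`draft.txt`, `report.txt`, `archive`):

```bash
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Hypothetical file and folder for illustration
echo "sample" > draft.txt
mkdir archive

# Destination is a new file name: the file is renamed in place
mv draft.txt report.txt

# Destination is an existing folder: the file is moved into it
mv report.txt archive

ls            # archive
ls archive    # report.txt
```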
### Move files in verbose mode
To view the details of the files being moved/renamed, use the `-v` option. In the example below, we move the `release_names_3.txt` file into the `r_releases` folder using `mv -v`.
```{bash c24a, eval =FALSE}
mv -v release_names_3.txt r_releases
```
```{bash c24b, echo=FALSE}
cd cline
mv -v release_names_3.txt r_releases
```
### Do not overwrite existing files
How do we ensure that files are not overwritten without prompting the user first? In the below example, we will try to overwrite the `release_names_2.txt` in the `r_releases` folder using `mv` and see what happens. But first, let us look at the contents of the `release_names_2.txt` file using the `cat` command.
We will look into the `cat` command in more detail in the next chapter but for the time being it is sufficient to know that it prints contents of a file. The file contains release names of different R versions.
```{bash c77a, eval =FALSE}
cat r_releases/release_names_2.txt
```
```{bash c77b, echo=FALSE}
cd cline
cat r_releases/release_names_2.txt
```
In our current working directory, we will create another file of the same name i.e. `release_names_2.txt` but its contents are different from the file in the `r_releases` folder. It contains the string `release_names` and nothing else. We will now move this file into the `r_releases` folder but use the option `-n` to ensure that the file in the `r_releases` folder is not overwritten. We can confirm this by printing the contents of the file in the `r_releases` folder.
The `echo` command is used to print text to the terminal or to write to a file. We will explore it in more detail in the next chapter.
```{bash c78a, eval =FALSE}
echo "release_names" > release_names_2.txt
mv -n release_names_2.txt r_releases
cat r_releases/release_names_2.txt
```
```{bash c78b, echo=FALSE}
cd cline
echo "release_names" > release_names_2.txt
mv -n release_names_2.txt r_releases
cat r_releases/release_names_2.txt
```
As you can observe, the contents of the file in the `r_releases` folder have not changed. In the next section, we will learn to overwrite the contents using the `-f` option.
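A detail that is easy to overlook: when `-n` skips the move, the source file stays where it was. The sketch below shows this in a scratch directory; the file and folder names are hypothetical.

```bash
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Hypothetical files: same name, different contents
mkdir archive
echo "original" > archive/names.txt
echo "replacement" > names.txt

# -n refuses to overwrite archive/names.txt. Some versions of mv report
# a non-zero exit status when they skip a file, hence the `|| true`.
mv -n names.txt archive || true

cat archive/names.txt   # prints: original
cat names.txt           # prints: replacement (the source was not moved)
```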
### Do not prompt for confirmation before overwriting files
What if we actually intend to overwrite a file and do not want to be asked for confirmation? In this case, we can use the `-f` option which stands for `--force` i.e. do not prompt before overwriting. Let us first print the contents of the `release_names_2.txt` file in the `r_releases` folder.
```{bash c79a, eval =FALSE}
cat r_releases/release_names_2.txt
```
```{bash c79b, echo=FALSE}
cd cline
cat r_releases/release_names_2.txt
```
Now we will create another file of the same name in the current working directory but with different content and use the `-f` option to overwrite the file in the `r_releases` folder. You can see that the contents of the file in the `r_releases` folder have changed.
```{bash c80a, eval =FALSE}
echo "release_names" > release_names_2.txt
mv -f release_names_2.txt r_releases
cat r_releases/release_names_2.txt
```
```{bash c80b, echo=FALSE}
cd cline
echo "release_names" > release_names_2.txt
mv -f release_names_2.txt r_releases
cat r_releases/release_names_2.txt
```
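The `-i` option listed in the table at the start of this section sits between `-n` and `-f`: it asks before overwriting. In a terminal you would type the answer yourself; the sketch below pipes the answer in so that it can run unattended. The scratch directory and file names are assumptions for illustration.

```bash
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Hypothetical files: same name, different contents
mkdir archive
echo "original" > archive/names.txt
echo "replacement" > names.txt

# Answer "n" to the overwrite prompt; the destination is left untouched.
# `|| true` is there because some versions of mv return a non-zero
# status when the overwrite is declined.
printf 'n\n' | mv -i names.txt archive || true
cat archive/names.txt   # prints: original

# Answer "y" and the destination is replaced
printf 'y\n' | mv -i names.txt archive
cat archive/names.txt   # prints: replacement
```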
## Remove/Delete Files
The `rm` command is used to delete/remove files & folders. Using additional options, we can
- remove directories & sub-directories
- forcibly remove directories
- interactively remove multiple files
- display information about files removed/deleted
```{r table_rm, echo=FALSE}
cname <- c("`rm`", "`rm -r`", "`rm -rf`", "`rm -i`", "`rm -v`")
descrip <- c("Remove files/directories",
"Recursively remove a directory & all its subdirectories",
"Forcibly remove directory without prompting for confirmation or showing error messages",
"Interactively remove multiple files, with a prompt before every removal",
"Remove files in verbose mode, printing a message for each removed file")
data.frame(Command = cname, Description = descrip) %>%
kable() %>%
kable_styling(
bootstrap_options = c("striped", "hover", "condensed", "responsive")
)
```
### Remove files
Let us use `rm` to remove the file `myanalysis.R` (we created it earlier using the `touch` command).
```{bash c25a, eval=FALSE}
rm myanalysis.R
ls
```
```{bash c25b, echo=FALSE}
cd cline
rm myanalysis.R
ls
```
### Recursive Deletion
How about folders or directories? The `-r` option, which stands for `--recursive`, removes a directory together with all its contents, including sub-directories. Let us remove the `myproject1` folder and all its contents.
```{bash c26a, eval=FALSE}
rm -r myproject1
ls
```
```{bash c26b, echo=FALSE}
cd cline
rm -r myproject1
ls
```
### Force Removal
Use the `-f` option, which stands for `--force`, to forcibly remove a directory and all its contents without prompting for confirmation. It also ignores files that do not exist instead of reporting an error. Let us remove the `myproject2` folder and all its contents.
```{bash c81a, eval=FALSE}
rm -rf myproject2
ls
```
```{bash c81b, echo=FALSE}
cd cline
rm -rf myproject2
ls
```
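To make the effect of `-f` concrete, the sketch below compares `rm` with and without it on a file that does not exist, and then removes a folder with `-rf`. It runs in a scratch directory; `missing.txt` and `project` are made-up names.

```bash
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Without -f, removing a file that does not exist is an error
if rm missing.txt 2>/dev/null; then
  echo "rm succeeded"
else
  echo "rm failed: missing.txt does not exist"
fi

# With -f, the missing file is ignored and rm reports success
rm -f missing.txt && echo "rm -f succeeded"

# -r is still required for directories; -f only affects prompting and
# the handling of missing files
mkdir -p project/data
touch project/data/raw.csv
rm -rf project
[ -d project ] || echo "project removed"
```

Because `rm -rf` neither asks nor complains, it is worth double-checking the path before pressing Enter.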
### Verbose Mode
The `-v` option removes files in verbose mode, printing a message for each removed file. This is useful when you want to see exactly which files are being removed. In the example below, we will remove all files with the `.txt` extension from the `myfiles` folder. Instead of specifying the name of each text file, we use the wildcard `*` along with `.txt` i.e. any file with the extension `.txt` will be removed.
```{bash c27a, eval=FALSE}
cd myfiles
rm -v *.txt
```
```{bash c27b, echo=FALSE}
cd cline
cd myfiles
rm -v *.txt
```
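Since a wildcard can match more files than expected, one cautious habit is to preview the match before deleting. The sketch below does this with `echo` in a scratch directory; the file names are hypothetical.

```bash
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Hypothetical files: two text files and one R script
touch notes.txt todo.txt analysis.R

# Preview: the shell expands *.txt before the command runs, so echo
# shows exactly the names rm would receive
echo *.txt              # prints: notes.txt todo.txt

# Remove the matched files; -v reports each removal
rm -v *.txt

ls                      # prints: analysis.R
```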
## Compare Files
`diff` stands for difference. It is used to compare files line by line and display differences. It also indicates which lines in one file must be changed to make the files identical. Using additional options, we can
- ignore white spaces while comparing files
- show differences side by side
- show differences in unified format
- compare directories recursively
- display names of files that differ
```{r table_diff, echo=FALSE}
cname <- c("`diff`", "`diff -w`", "`diff -y`", "`diff -u`", "`diff -r`", "`diff -rq`")
descrip <- c("Compare files & directories",
"Compare files; ignoring white spaces",
"Compare files; showing differences side by side",
"Compare files; show differences in unified format",
"Compare directories recursively",
"Compare directories; show the names of files that differ")
data.frame(Command = cname, Description = descrip) %>%
kable() %>%
kable_styling(
bootstrap_options = c("striped", "hover", "condensed", "responsive")
)
```
### Compare Files
Let us compare the contents of the following files
- `imports_olsrr.txt`
- `imports_blorr.txt`
The files contain the names of R packages imported by the [olsrr](https://olsrr.rsquaredacademy.com/) and [blorr](https://blorr.rsquaredacademy.com/) packages respectively (**Full disclosure: both the above R packages are developed by Rsquared Academy.**).
`diff` uses certain special symbols and gives instructions to make the files identical. The instructions are on how to change the first file to make it identical to the second file. We list the symbols below
- **a** for add
- **c** for change
- **d** for delete
We will use the `-w` option to ignore white spaces while comparing the files.
```{bash c30, eval=FALSE}
diff -w imports_olsrr.txt imports_blorr.txt
```
```{r r30, echo=FALSE, eval=FALSE}
suppressWarnings(cat(system2('diff', c("-w", "imports_olsrr.txt", "imports_blorr.txt"), TRUE), sep = "\n"))
```
Let us interpret the results. `4a5` means **after line 4 in file 1, add line 5 from file 2**. In other words, adding `caret` (line 5 in `imports_blorr.txt`) after line 4 in `imports_olsrr.txt` makes the two files identical.
Let us change the file order and see the instructions from `diff`.
```{bash c30a, eval=FALSE}
diff -w imports_blorr.txt imports_olsrr.txt
```
```{r r30a, echo=FALSE, eval=FALSE}
suppressWarnings(cat(system2('diff', c("-w", "imports_blorr.txt", "imports_olsrr.txt"), TRUE), sep = "\n"))
```
`5d4` means **delete line 5 from file 1 so that the files are in sync at line 4** i.e. deleting `caret`, which is line 5 in `imports_blorr.txt`, makes the two files identical.
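The three instructions (**a**, **c** and **d**) can be reproduced with a few small files. The sketch below uses made-up files (`first.txt`, `second.txt`, `third.txt`) in a scratch directory rather than the package lists above, and assumes a `diff` that prints the traditional default format, as GNU `diff` does.

```bash
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Hypothetical input files
printf 'one\ntwo\nthree\nfour\nsix\n' > first.txt
printf 'one\ntwo\nthree\nfour\nfive\nsix\n' > second.txt
printf 'one\ntwo\nthree\nFOUR\nsix\n' > third.txt

# diff exits with status 1 when the files differ, so `|| true` keeps
# a script going

# a: add line 5 of second.txt after line 4 of first.txt
diff first.txt second.txt || true
# 4a5
# > five

# d: delete line 5 of second.txt to get back to first.txt
diff second.txt first.txt || true
# 5d4
# < five

# c: change line 4 of first.txt into line 4 of third.txt
diff first.txt third.txt || true
# 4c4
# < four
# ---
# > FOUR
```

Lines starting with `<` come from the first file and lines starting with `>` come from the second.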
### Side By Side
To view the differences between the files side by side, use the `-y` option.
```{bash c31, eval=FALSE}
diff -y imports_olsrr.txt imports_blorr.txt
```
```{r r31, echo=FALSE, eval=FALSE}
suppressWarnings(cat(system2('diff', c("-y", "imports_olsrr.txt", "imports_blorr.txt"), TRUE), sep = "\n"))
```
### Unified Format
To view the differences between the files in a unified format, use the `-u` option.
```{bash c32, eval=FALSE}
diff -u imports_olsrr.txt imports_blorr.txt
```
```{r r32, echo=FALSE, eval=FALSE}
suppressWarnings(cat(system2('diff', c("-u", "imports_olsrr.txt", "imports_blorr.txt"), TRUE), sep = "\n"))
```
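To show what the unified format looks like, here is a small sketch with made-up files in a scratch directory. The first two lines of `diff -u` output name the files and include timestamps, so they are skipped to keep the output stable.

```bash
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Hypothetical input files: second.txt has one extra line
printf 'one\ntwo\nthree\nfour\nsix\n' > first.txt
printf 'one\ntwo\nthree\nfour\nfive\nsix\n' > second.txt

# Skip the two header lines (--- first.txt ... and +++ second.txt ...)
diff -u first.txt second.txt | tail -n +3
# @@ -2,4 +2,5 @@
#  two
#  three
#  four
# +five
#  six
```

The `@@ -2,4 +2,5 @@` line says that the block covers 4 lines starting at line 2 of the first file and 5 lines starting at line 2 of the second. Lines prefixed with `+` exist only in the second file, lines prefixed with `-` exist only in the first, and unprefixed lines are shared context.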
### Compare Recursively
To compare directories recursively, use the `-r` option. Let us compare the `mypackage` and `myproject` folders.
```{bash c82, eval=FALSE}
diff -r mypackage myproject
```
```{r r82, echo=FALSE, eval=FALSE}
suppressWarnings(cat(system2('diff', c("-r", "mypackage", "myproject"), TRUE), sep = "\n"))
```
### File Details
To compare directories and view the names of files that differ, use the `-rq` option. In the below example, we look at the names of files that differ in `mypackage` and `myproject` folders.
```{bash c83, eval=FALSE}
diff -rq mypackage myproject
```
```{r r83, echo=FALSE, eval=FALSE}
suppressWarnings(cat(system2('diff', c("-rq", "mypackage", "myproject"), TRUE), sep = "\n"))
```
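A small sketch of what `-rq` reports, using two made-up folders (`pkg_a`, `pkg_b`) in a scratch directory. One file differs, one is identical and one exists on one side only.

```bash
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Hypothetical folders for illustration
mkdir -p pkg_a/R pkg_b/R
echo "version 1" > pkg_a/R/utils.R
echo "version 2" > pkg_b/R/utils.R
echo "same" > pkg_a/README.md
echo "same" > pkg_b/README.md
echo "only here" > pkg_a/NEWS.md

# -r descends into sub-folders, -q reports only which files differ.
# diff exits with status 1 when it finds differences, hence `|| true`.
diff -rq pkg_a pkg_b || true
# Reports that NEWS.md exists only in pkg_a and that the two
# R/utils.R files differ; the identical README.md is not mentioned
```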
## R Functions
In R, file operations can be performed using functions from both base R and the [fs](https://fs.r-lib.org/index.html) package.
```{r r_file_manage, echo=FALSE}
cname <- c("`touch`", "`cp`", "`mv`", "`rm`", "`diff`")
descrip <- c("`file.create()` / `fs::file_create()` / `fs::file_touch()`",
"`file.copy()` / `fs::file_copy()` / `fs::dir_copy()`",
"`file.rename()` / `fs::file_move()`",
"`file.remove()` / `fs::file_delete()`",
"")
data.frame(Command = cname, R = descrip) %>%
kable() %>%
kable_styling(
bootstrap_options = c("striped", "hover", "condensed", "responsive")
)
```