# "Filtering file content with the `grep` command"
> Cheatsheet on the `grep` command in macOS

- toc: true 
- badges: true
- comments: true
- categories: [cheatsheet]

> Important: Ignore the exclamation mark `!` and percent `%` symbols at the beginning of the command lines. This page is generated from a Jupyter notebook, whose cells run Python code; prefixing a line with `!` or `%` (a "magic command") runs it as a shell command instead.

# TODO

- `-v` (invert match)
- search directory recursively
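
Until these items are written up, here is a rough sketch of both. It runs on throwaway files created with `mktemp` (the directory layout and file names are made up for the illustration), not on the system files used in the rest of this post: `-v` keeps the lines that do not match, and `-r` searches a directory recursively.

```shell
# Hypothetical sample data in a temporary directory (names are illustrative).
dir=$(mktemp -d)
mkdir "$dir/sub"
printf 'Alan Turing\nGrace Hopper\n' > "$dir/people.txt"
printf 'Alan Kay\n' > "$dir/sub/more.txt"

# -v inverts the match: print the lines that do NOT contain "Alan".
inverted=$(grep -v Alan "$dir/people.txt")
echo "$inverted"

# -r descends into subdirectories; -l lists only the names of matching files.
recursive=$(grep -rl Alan "$dir" | sort)
echo "$recursive"

rm -r "$dir"
```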

# References

- [Examples using `grep`](https://www.tldp.org/LDP/Bash-Beginners-Guide/html/sect_04_02.html) (tldp.org)
- [Character classes and bracket expressions](https://www.gnu.org/software/grep/manual/html_node/Character-Classes-and-Bracket-Expressions.html) (gnu.org)
- [A large collection of Unix/Linux ‘grep’ command examples](https://alvinalexander.com/unix/edu/examples/grep.shtml) (alvinalexander.com)
- [grep or and not operators](https://www.thegeekstuff.com/2011/10/grep-or-and-not-operators/) (thegeekstuff.com)
- [15 practical unix grep command examples](https://www.thegeekstuff.com/2009/03/15-practical-unix-grep-command-examples/) (thegeekstuff.com)

- [What's the difference between `\b` and `\<` in the `grep` command?](https://unix.stackexchange.com/questions/121739/whats-the-difference-between-b-and-in-the-grep-command) (unix.stackexchange)
- [Tutorial: Find Strings in Text Files Using Grep with Regular Expressions](https://thenewstack.io/tutorial-find-strings-in-text-files-using-grep-with-regular-expressions/) (Matt Zand, thenewstack)
- [Regular Expressions In grep examples](https://www.cyberciti.biz/faq/grep-regular-expressions/) (cyberciti.biz)
- [regex quickstart](https://www.rexegg.com/regex-quickstart.html) (Rex Egg)

# Documentation from the `man` page 
Also [available here](https://ss64.com/osx/grep.html):

In [None]:
!man grep

# Sample text files
The sample text files used in this post are directly available from the OS:
- calendar files in `/usr/share/calendar`
- dictionary words in `/usr/share/dict/words`
- meaning of flowers in `/usr/share/misc/flowers`
- birth tokens in `/usr/share/misc/birthtoken`
- the ASCII table in `/usr/share/misc/ascii`
- units in `/usr/share/misc/units.lib`

# System information

Version of the `grep` utility:

In [1]:
!grep -V

grep (BSD grep) 2.5.1-FreeBSD


Operating system (kernel) version:

In [7]:
!uname -v

Darwin Kernel Version 19.4.0: Wed Mar  4 22:28:40 PST 2020; root:xnu-6153.101.6~15/RELEASE_X86_64


# Basic usage

## Match a string

In a single file:

In [8]:
!grep Alan /usr/share/calendar/calendar.birthday

06/07	Alan Mathison Turing died, 1954
06/23	Alan Mathison Turing born, 1912


In multiple files:

In [9]:
!grep Alan /usr/share/calendar/calendar.*

/usr/share/calendar/calendar.birthday:06/07	Alan Mathison Turing died, 1954
/usr/share/calendar/calendar.birthday:06/23	Alan Mathison Turing born, 1912
/usr/share/calendar/calendar.computer:06/07	Alan Mathison Turing died, 1954
/usr/share/calendar/calendar.computer:06/23	Alan Mathison Turing born, 1912
/usr/share/calendar/calendar.freebsd:06/06	Alan Eldridge <alane@FreeBSD.org> died in Denver, Colorado, 2003
/usr/share/calendar/calendar.history:06/28	Supreme Court decides in favor of Alan Bakke, 1978


## Show line numbers: `-n`

In [10]:
!grep -n Alan /usr/share/calendar/calendar.birthday

158:06/07	Alan Mathison Turing died, 1954
168:06/23	Alan Mathison Turing born, 1912


## Highlight match: `--color`

In [11]:
!grep --color Alan /usr/share/calendar/calendar.birthday

06/07	**Alan** Mathison Turing died, 1954
06/23	**Alan** Mathison Turing born, 1912


## Count matching lines: `-c`

In a single file:

In [12]:
!grep -c Alan /usr/share/calendar/calendar.birthday

2


In multiple files:

In [13]:
!grep -c Alan /usr/share/calendar/calendar.*

/usr/share/calendar/calendar.all:0
/usr/share/calendar/calendar.australia:0
/usr/share/calendar/calendar.birthday:2
/usr/share/calendar/calendar.christian:0
/usr/share/calendar/calendar.computer:2
/usr/share/calendar/calendar.croatian:0
/usr/share/calendar/calendar.dutch:0
/usr/share/calendar/calendar.freebsd:1
/usr/share/calendar/calendar.french:0
/usr/share/calendar/calendar.german:0
/usr/share/calendar/calendar.history:1
/usr/share/calendar/calendar.holiday:0
/usr/share/calendar/calendar.hungarian:0
/usr/share/calendar/calendar.judaic:0
/usr/share/calendar/calendar.lotr:0
/usr/share/calendar/calendar.music:0
/usr/share/calendar/calendar.newzealand:0
/usr/share/calendar/calendar.russian:0
/usr/share/calendar/calendar.southafrica:0
/usr/share/calendar/calendar.ukrainian:0
/usr/share/calendar/calendar.usholiday:0
/usr/share/calendar/calendar.world:0
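
Most of the counts above are zero. One possible way to keep only the files with at least one matching line is to pipe the counts through a second `grep` that drops the lines ending in `:0`. The sketch below uses small made-up files rather than the calendar files; note that `-c` counts matching lines, not individual occurrences.

```shell
# Hypothetical sample files in a temporary directory.
dir=$(mktemp -d)
printf 'Alan\nAlan\n' > "$dir/a.txt"
printf 'Grace\n' > "$dir/b.txt"
printf 'Alan\n' > "$dir/c.txt"

# -c prints one "file:count" line per file; the second grep removes zero counts.
nonzero=$(cd "$dir" && grep -c Alan a.txt b.txt c.txt | grep -v ':0$')
echo "$nonzero"

rm -r "$dir"
```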


## Case-insensitive match: `-i`

In [14]:
!grep --color -i unix /usr/share/calendar/calendar.computer

01/01	The Epoch (Time 0 for **UNIX** systems, Midnight GMT, 1970)
05/19	**UNIX** is 10000 days old, 1997
08/14	First **Unix**-based mallet created, 1954


## File names with a match: `-l`

In [15]:
!grep -l Alan /usr/share/calendar/calendar.*

/usr/share/calendar/calendar.birthday
/usr/share/calendar/calendar.computer
/usr/share/calendar/calendar.freebsd
/usr/share/calendar/calendar.history


## Byte offset of matching line: `-b`

In [16]:
!grep -b Alan /usr/share/calendar/calendar.birthday

6906:06/07	Alan Mathison Turing died, 1954
7346:06/23	Alan Mathison Turing born, 1912
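
The number printed by `-b` is the byte offset of the start of the matching line, counted from the beginning of the file, not the position of the match inside the line. A tiny illustration on made-up input:

```shell
# Three 4-byte lines ("aaa\n", "bbb\n", "ccc\n"): "ccc" starts at byte 8.
offset=$(printf 'aaa\nbbb\nccc\n' | grep -b ccc)
echo "$offset"        # 8:ccc

# "cc" sits in the middle of "xxcc", yet the offset is that of the line start.
offset_line=$(printf 'aaa\nxxcc\n' | grep -b cc)
echo "$offset_line"   # 4:xxcc
```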


## Include/exclude files: `--include`, `--exclude`

Exclude a file from the search:

In [17]:
!grep Alan --exclude /usr/share/calendar/calendar.computer /usr/share/calendar/calendar.*

/usr/share/calendar/calendar.birthday:06/07	Alan Mathison Turing died, 1954
/usr/share/calendar/calendar.birthday:06/23	Alan Mathison Turing born, 1912
/usr/share/calendar/calendar.freebsd:06/06	Alan Eldridge <alane@FreeBSD.org> died in Denver, Colorado, 2003
/usr/share/calendar/calendar.history:06/28	Supreme Court decides in favor of Alan Bakke, 1978


Include only the files whose name matches a pattern:

In [18]:
!grep Alan --include "calendar.*" /usr/share/calendar/*

/usr/share/calendar/calendar.birthday:06/07	Alan Mathison Turing died, 1954
/usr/share/calendar/calendar.birthday:06/23	Alan Mathison Turing born, 1912
/usr/share/calendar/calendar.computer:06/07	Alan Mathison Turing died, 1954
/usr/share/calendar/calendar.computer:06/23	Alan Mathison Turing born, 1912
/usr/share/calendar/calendar.freebsd:06/06	Alan Eldridge <alane@FreeBSD.org> died in Denver, Colorado, 2003
/usr/share/calendar/calendar.history:06/28	Supreme Court decides in favor of Alan Bakke, 1978
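
`--include` and `--exclude` take glob patterns and are most useful together with a recursive search (`-r`), where no shell wildcard selects the files. A minimal sketch, assuming a throwaway directory with made-up file names:

```shell
# Hypothetical sample files in a temporary directory.
dir=$(mktemp -d)
printf 'Alan Turing\n' > "$dir/notes.txt"
printf 'Alan Kay\n' > "$dir/debug.log"

# Search recursively, but only in files whose name matches *.txt.
only_txt=$(grep -rl --include='*.txt' Alan "$dir")
echo "$only_txt"

# Search recursively, skipping files whose name matches *.txt.
no_txt=$(grep -rl --exclude='*.txt' Alan "$dir")
echo "$no_txt"

rm -r "$dir"
```

Quoting the glob keeps the shell from expanding it before `grep` sees it.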


## Whole word match: `-w`

In [19]:
!grep -w  --color Francis /usr/share/calendar/calendar.birthday

01/22	Sir **Francis** Bacon born, 1561
11/20	Robert **Francis** Kennedy (RFK) born in Boston, Massachusetts, 1925


... as opposed to substring matches:

In [20]:
!grep --color Francis /usr/share/calendar/calendar.birthday

01/22	Sir **Francis** Bacon born, 1561
03/30	**Francis**co Jose de Goya born, 1746
04/29	William Randolph Hearst born in San **Francis**co, 1863
05/30	Mel (Melvin Jerome) Blanc born in San **Francis**co, 1908
11/20	Robert **Francis** Kennedy (RFK) born in Boston, Massachusetts, 1925
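
The same contrast can be reproduced without the calendar file. The three sample lines below are made up for the illustration:

```shell
sample='Sir Francis Bacon
Francisco Goya
San Francisco'

# Without -w, "Francis" also matches inside "Francisco".
substring_count=$(printf '%s\n' "$sample" | grep -c Francis)
echo "$substring_count"   # 3

# With -w, the match must be delimited by non-word characters.
word_lines=$(printf '%s\n' "$sample" | grep -w Francis)
echo "$word_lines"        # Sir Francis Bacon
```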


## Lines before/after/around match: `-B`, `-A`, `-C`

Show two lines before each match:

In [21]:
!grep -n --color -B2 uncomputed /usr/share/dict/words

212942-uncomputableness
212943-uncomputably
212944:**uncomputed**


Show three lines after each match:

In [22]:
!grep -n --color -A3 uncomputed /usr/share/dict/words

212944:**uncomputed**
212945-uncomraded
212946-unconcatenated
212947-unconcatenating


Show two lines before and three lines after each match:

In [23]:
!grep -n --color -B2 -A3 uncomputed /usr/share/dict/words

212942-uncomputableness
212943-uncomputably
212944:**uncomputed**
212945-uncomraded
212946-unconcatenated
212947-unconcatenating


Show two lines before and after each match (`-C2` is equivalent to `-B2 -A2`):

In [24]:
!grep -n --color -C2 uncomputed /usr/share/dict/words

212942-uncomputableness
212943-uncomputably
212944:**uncomputed**
212945-uncomraded
212946-unconcatenated
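
The three context options are easy to check on a short numbered input. This sketch assumes the `seq` utility is available to generate the numbers 1 to 9:

```shell
# seq prints the numbers 1 to 9, one per line; only the line "5" matches.
before=$(seq 1 9 | grep -B2 5)   # lines 3, 4, 5
after=$(seq 1 9 | grep -A3 5)    # lines 5, 6, 7, 8
around=$(seq 1 9 | grep -C1 5)   # lines 4, 5, 6

# Print each result on a single line for readability.
echo $before
echo $after
echo $around
```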


# Regular expressions

- `?` The preceding item is optional and matched at most once
- `*` The preceding item will be matched zero or more times
- `+` The preceding item will be matched one or more times
- `{n}` The preceding item is matched exactly n times
- `{n,}` The preceding item is matched n or more times
- `{,m}` The preceding item is matched at most m times
- `{n,m}` The preceding item is matched at least n times, but not more than m times

- `\<` matches the beginning of a word
- `\>` matches the end of a word
- `\b` matches a word boundary, at either the beginning or the end of a word

Classes of characters:  
- `[[:alnum:]]`: Alphanumeric characters.
- `[[:alpha:]]`: Alphabetic characters
- `[[:blank:]]`: Blank characters: space and tab.
- `[[:digit:]]`: Digits: ‘0 1 2 3 4 5 6 7 8 9’.
- `[[:lower:]]`: Lower-case letters: ‘a b c d e f g h i j k l m n o p q r s t u v w x y z’.
- `[[:space:]]`: Space characters: tab, newline, vertical tab, form feed, carriage return, and space.
- `[[:upper:]]`: Upper-case letters: ‘A B C D E F G H I J K L M N O P Q R S T U V W X Y Z’.
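
How the quantifiers listed above are written depends on the regular expression flavour: with `-E` (extended syntax) they are used as listed, while in the default basic syntax the braces are escaped as `\{n\}`. A small sketch on three made-up words:

```shell
words='color
colour
colouur'

# ? : the "u" is optional (zero or one occurrence).
optional=$(printf '%s\n' "$words" | grep -E '^colou?r$')
echo $optional      # color colour

# + : one or more "u".
one_or_more=$(printf '%s\n' "$words" | grep -E '^colou+r$')
echo $one_or_more   # colour colouur

# {2} : exactly two "u", in extended syntax (-E) ...
exactly_two=$(printf '%s\n' "$words" | grep -E '^colou{2}r$')
# ... and the same pattern in basic syntax, with escaped braces.
exactly_two_bre=$(printf '%s\n' "$words" | grep '^colou\{2\}r$')
echo "$exactly_two" "$exactly_two_bre"   # colouur colouur
```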


## Line and word anchors: `^`, `$`, `\<`, `\>`, `\b`

Lines **beginning** with pattern:

In [25]:
!grep -n --color ^compute /usr/share/dict/words

40564:**compute**
40565:**compute**r


Lines **ending** with pattern:

In [26]:
!grep -n --color "compute$" /usr/share/dict/words 

40564:**compute**
117000:mis**compute**
164643:re**compute**


Words **beginning** with pattern:

In [29]:
!grep -n --color '\<compute' /usr/share/dict/words

40564:**compute**
40565:**compute**r


Words **ending** with pattern:

In [30]:
!grep -n --color 'compute\>' /usr/share/dict/words

40564:**compute**
117000:mis**compute**
164643:re**compute**
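
In the dictionary file every line is a single word, so the line anchors and the word anchors give the same results. The difference shows up on lines that contain several words, as in this made-up sample:

```shell
lines='compute
recompute
please compute this'

# ^ anchors to the start of the LINE: only the first line matches.
line_start=$(printf '%s\n' "$lines" | grep -c '^compute')
echo "$line_start"   # 1

# \< anchors to the start of a WORD, anywhere in the line.
word_start=$(printf '%s\n' "$lines" | grep -c '\<compute')
echo "$word_start"   # 2

# \> anchors to the end of a word: "recompute" qualifies as well.
word_end=$(printf '%s\n' "$lines" | grep -c 'compute\>')
echo "$word_end"     # 3
```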


Words of specified length:

In [66]:
!grep '\<.\{24\}\>' /usr/share/dict/words

formaldehydesulphoxylate
pathologicopsychological
scientificophilosophical
tetraiodophenolphthalein
thyroparathyroidectomize


Words with fixed length and specified starting and ending characters:

In [32]:
!grep '\<y...h\>' /usr/share/dict/words

yamph
yarth
yerth
yirth
youth


Words with specified first and last characters, of any length:

In [33]:
!grep '\<x.*y\>' /usr/share/dict/words

xanthocyanopsy
xanthocyanopy
xenagogy
xenelasy
xenogamy
xenogeny
xenophoby
xerically
xerography
xeromorphy
xerophagy
xerophily
xerophthalmy
xerophytically
xyloglyphy
xylographically
xylography
xylology
xylomancy
xylopyrography
xylotomy
xylotypography
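
When each line holds a single word, as in `/usr/share/dict/words`, the word anchors can presumably be replaced by the `-x` option, which requires the pattern to match the whole line. A small sketch on made-up input:

```shell
words='yes
youth
youths
mouth'

# Five-character words starting with "y" and ending with "h", using word anchors.
anchored=$(printf '%s\n' "$words" | grep '\<y...h\>')

# Same result with -x (match the entire line), no anchors needed.
whole_line=$(printf '%s\n' "$words" | grep -x 'y...h')
echo "$anchored" "$whole_line"   # youth youth

# Lines of exactly five characters, whatever they are.
five=$(printf '%s\n' "$words" | grep -x '.\{5\}')
echo $five                       # youth mouth
```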


## Boolean `OR`

In [68]:
!grep -n --color -E 'computer|hardware' /usr/share/dict/words

40565:**computer**
82436:**hardware**
82437:**hardware**man


In [70]:
!grep -n --color -E 'Rose|Violet' /usr/share/misc/birthtoken

4:February:Amethyst:**Violet**
8:June:Pearl:**Rose**


Match whole words `gray` or `grey`:

In [76]:
!grep -n --color '\<gr[ae]y\>' /usr/share/dict/words

79755:**gray**
79976:**grey**
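
Besides `-E` with `|`, an `OR` can also be expressed by passing several patterns with `-e`: a line is printed if any of them matches. A sketch comparing the two forms on made-up input:

```shell
words='computer
hardware
software'

# Extended regex alternation.
with_E=$(printf '%s\n' "$words" | grep -E 'computer|hardware')

# Several -e patterns: a line matches if ANY of them matches.
with_e=$(printf '%s\n' "$words" | grep -e computer -e hardware)

echo $with_E   # computer hardware
echo $with_e   # computer hardware
```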


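Strings starting with `f` and ending with `t`. Caution: `[^:space:]` below is not a negated character class but a bracket expression that excludes the literal characters `:`, `s`, `p`, `a`, `c` and `e`; the negated class is written `[^[:space:]]`. This is why `feet` is not highlighted in the output while `ft			foot`, tabs included, is:

In [47]: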
!grep -n --color '\bf[^:space:]\{0,\}t\b' /usr/share/misc/units.lib

186:**foot**			12 in
187:feet			**foot**
188:**ft			foot**
189:yard			3 **ft**
193:mile			5280 **ft**
196:british			1200|3937 m/**ft**
265:pood			40 **funt**
266:**funt**			0.40951 kg
267:lot			1|32 **funt**
424:admiraltyknot		6080 **ft**/hr
429:arpentlin		191.835 **ft**
456:cable			720 **ft**
467:chain			66 **ft**
488:engineerschain		100 **ft**
489:engineerslink		100|100 **ft**
496:fathom			6 **ft**
498:**fifth			4|5 qt**
504:**fortnight**		14 da
512:geodeticfoot		british-**ft**
550:link			66|100 **ft**
597:poundal			**ft**-lb/sec2
617:rope			20 **ft**
635:slug			lb-g-sec2/**ft**
648:surveyfoot		british-**ft**
650:surveyorschain		66 **ft**
6

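With `.\{1,\}`, at least one character of any kind (whitespace included) must separate the `f` from the `t`, so a bare `ft` no longer matches on its own:

In [42]: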
!grep -n --color '\bf.\{1,\}t\b' /usr/share/misc/units.lib

186:**foot**			12 in
187:**feet			foot**
188:**ft			foot**
215:**floz			1|16 pt**
265:pood			40 **funt**
266:**funt**			0.40951 kg
267:lot			1|32 **funt**
314:**farad			coul/volt**
496:**fathom			6 ft**
498:**fifth			4|5 qt**
503:**footlambert**		cd/pi-ft2
504:**fortnight**		14 da
716:**frenchfoot		16|15 ft**
717:**frenchfeet		frenchfoot**
718:toise			6 **frenchfeet**
721:militarypace		2.5 **feet**
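
To keep each match within a single word, the characters between `f` and `t` can be restricted to non-whitespace with the negated class `[^[:space:]]` (note the double brackets). The sketch below runs on two tab-separated lines modelled on `units.lib`, not on the real file, and uses `-o` to print each match on its own line:

```shell
# Made-up sample: two lines with tab-separated fields.
sample=$(printf 'feet\t\t\tfoot\nft\t\t\tfoot')

# f, then any number of non-whitespace characters, then t, as a whole word.
no_space=$(printf '%s\n' "$sample" | grep -o '\bf[^[:space:]]*t\b')
echo $no_space   # feet foot ft foot
```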


## Character classes

Match two-letter words made of an uppercase letter followed by a lowercase letter:

In [64]:
!grep --color '\<[[:upper:]][[:lower:]]\>' /usr/share/misc/birthtoken 

May:Emerald:Lily **Of** The Valley
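
Character classes combine with the `-o` option, which prints only the matched text instead of the whole line. The sample line below is adapted from the output above, with a year appended purely for the illustration:

```shell
# Made-up line: the ", 1984" suffix is not in the birthtoken file.
line='May:Emerald:Lily Of The Valley, 1984'

# An uppercase letter followed by a lowercase letter, as a whole word.
two_letters=$(printf '%s\n' "$line" | grep -o '\<[[:upper:]][[:lower:]]\>')
echo "$two_letters"   # Of

# One or more digits, written with a POSIX class (-E for the unescaped "+").
digits=$(printf '%s\n' "$line" | grep -E -o '[[:digit:]]+')
echo "$digits"        # 1984
```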


Match one or more `x` followed by one or more `y`:

In [62]:
!grep -n --color -E '[x]+[y]+' /usr/share/dict/words

1258:aceto**xy**l
1259:aceto**xy**phthalide
2168:acylo**xy**
2169:acylo**xy**methane
2731:adnexope**xy**
2805:ado**xy**
3774:agala**xy**
5413:alko**xy**
5414:alko**xy**l
5426:alkylo**xy**
5736:allo**xy**proteic
6715:amido**xy**
6716:amido**xy**l
6781:amino**xy**lol
9152:anore**xy**
9202:ano**xy**biosis
9203:ano**xy**biotic
9204:ano**xy**scope
9596:anthota**xy**
9665:anthra**xy**lon
9684:anthropodeo**xy**cholic
10433:antio**xy**gen
10434:antio**xy**genation
10435:antio**xy**genator
10436:antio**xy**genic
11589:apople**xy**
11702:Apo**xy**omenos
12091:apyre**xy**
12320:Araucario**xy**lon
13435:aro**xy**l


136641:o**xy**urous
136642:o**xy**welding
138193:panmi**xy**
138849:parado**xy**
139351:para**xy**lene
139737:paro**xy**sm
139738:paro**xy**smal
139739:paro**xy**smalist
139740:paro**xy**smally
139741:paro**xy**smic
139742:paro**xy**smist
139743:paro**xy**tone
139744:paro**xy**tonic
139745:paro**xy**tonize
141836:pentahydro**xy**
142816:periphra**xy**
142984:peritoneope**xy**
143215:pero**xy**
143216:pero**xy**l
144473:phenylglyo**xy**lic
144737:philo**xy**genous
144789:phlebope**xy**
145484:photota**xy**
145525:photo**xy**lography
145844:phyllota**xy**
147552:pi**xy**
149262:pneumonope**xy**
149275:pneumope**xy**
150021:polycar

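Match numbers of any length. Note that `\d` is not part of POSIX regular expressions: it works with the BSD `grep` used here, but the portable spelling is `[[:digit:]]+` or `[0-9]+`:

In [58]: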
!grep -n --color -E '\d+' /usr/share/calendar/calendar.computer

4: * $FreeBSD: src/usr.bin/calendar/calendars/calendar.computer,v **1**.**11** **2007**/**09**/**07** **03**:**23**:**06** edwin Exp $
10:**01**/**01**	AT&T officially divests its local Bell companies, **1984**
11:**01**/**01**	The Epoch (Time **0** for UNIX systems, Midnight GMT, **1970**)
12:**01**/**03**	Apple Computer founded, **1977**
13:**01**/**08**	American Telephone and Telegraph loses antitrust case, **1982**
14:**01**/**08**	Herman Hollerith patents first data processing computer, **1889**
15:**01**/**08**	Justice Dept. drops IBM suit, **1982**
16:**01**/**10**	First CDC **1604** deliv