# _Find_ command

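This notebook explains how to use the _find_ command, which walks a directory tree and prints the entries that match the given tests. It has many options; the most common ones are shown below.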

##  Examples

* Find all files with a name exactly equal to Finn.txt

In [52]:
% cd ~/Data

! find -name "Finn.txt"

/home/dsc/Data
./shell/Finn.txt


* Find all files with extension _.txt_. Note that an extension is just part of the file name, so we match it with a wildcard (*). The pattern is quoted so that find, not the shell, expands it.

In [53]:
% cd ~/Data

! find -name "*.txt"

/home/dsc/Data
./us_dot/otp/last_20.txt
./shell/con_lineas.txt
./shell/Text_4sed.txt
./shell/Finn.txt
./shell/Text_example.txt
./shell/Text_twice.txt
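
Why the quotes around the pattern matter can be checked with a small experiment. This is a minimal sketch, assuming a throwaway directory created with `mktemp`; the file names `a.txt` and `sub/b.txt` are hypothetical and not part of ~/Data.

```shell
# Build a scratch tree: one .txt at the top level, one in a subdirectory.
workdir=$(mktemp -d)
mkdir "$workdir/sub"
touch "$workdir/a.txt" "$workdir/sub/b.txt"

# Quoted: find receives the pattern *.txt and applies it at every level.
quoted=$(cd "$workdir" && find . -name "*.txt" | sort)

# Unquoted: the shell expands *.txt to a.txt first, so find only looks for a.txt.
unquoted=$(cd "$workdir" && find . -name *.txt | sort)

echo "quoted:";   printf '%s\n' "$quoted"
echo "unquoted:"; printf '%s\n' "$unquoted"

rm -rf "$workdir"
```

If the current directory held several _.txt_ files, the unquoted form would pass all of them to find and fail with an error instead.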


* Find all files with extension _.txt_ inside shell. 

In [54]:
% cd ~/Data

! find ./shell -name "*.txt"

/home/dsc/Data
./shell/con_lineas.txt
./shell/Text_4sed.txt
./shell/Finn.txt
./shell/Text_example.txt
./shell/Text_twice.txt


* Find files ignoring case. Use -iname instead of -name

In [55]:
% cd ~/Data

! find -iname "*.tXT" 

/home/dsc/Data
./us_dot/otp/last_20.txt
./shell/con_lineas.txt
./shell/Text_4sed.txt
./shell/Finn.txt
./shell/Text_example.txt
./shell/Text_twice.txt


* Find files larger than 1000 KB

In [56]:
% cd ~/Data

! find -size +1000k

/home/dsc/Data
./challenge/searches.csv.bz2
./challenge/bookings.csv.bz2
./airline_tickets/data.tar.bz2
./airline_tickets/sales_segments.csv.gz
./us_dot/otp/On_Time_On_Time_Performance_2015_2.zip
./us_dot/otp/On_Time_On_Time_Performance_2015_8.zip
./us_dot/otp/On_Time_On_Time_Performance_2015_4.zip
./us_dot/otp/On_Time_On_Time_Performance_2015_3.zip
./us_dot/otp/On_Time_On_Time_Performance_2015_7.zip
./us_dot/otp/On_Time_On_Time_Performance_2015_1.zip
./us_dot/otp/On_Time_On_Time_Performance_2015_6.zip
./us_dot/otp/On_Time_On_Time_Performance_2015_5.zip
./us_dot/traffic/T100_SEGMENT_ALL_CARRIER_2015.zip
./us_dot/traffic/T100_SEGMENT_ALL_CARRIER_2014.zip
./opentraveldata/optd_por_public.csv
./opentraveldata/opentravel.gz.tar


* Find files smaller than 1000 KB. Without a -type filter, directories are listed too

In [57]:
% cd ~/Data

! find -size -1000k

/home/dsc/Data
.
./challenge
./airline_tickets
./README.md
./us_dot
./us_dot/otp
./us_dot/otp/last_20.txt
./us_dot/traffic
./shell
./shell/zippped.zip
./shell/pruebas_sql
./shell/con_lineas.txt
./shell/Text_4sed.txt
./shell/Finn.txt
./shell/miscolumnas.vsc
./shell/Text_example.txt
./shell/Text_twice.txt
./shell/numbers
./shell/column_name_number.sh
./opentraveldata
./opentraveldata/optd_airlines.csv
./opentraveldata/Nofokker.csv
./opentraveldata/fokker.csv
./opentraveldata/optd_aircraft_con_comas_con_csvformat.csv
./opentraveldata/ref_airline_nb_of_flights.csv
./opentraveldata/optd_cabecera
./opentraveldata/optd_aircraft.sql
./opentraveldata/optd_aircraft.csv
./opentraveldata/prueba
./opentraveldata/optd_aircraft_comma.csv


* Find ONLY regular files smaller than 1000 KB. Adding -type f excludes directories

In [58]:
% cd ~/Data

! find -type f -size -1000k

/home/dsc/Data
./README.md
./us_dot/otp/last_20.txt
./shell/zippped.zip
./shell/con_lineas.txt
./shell/Text_4sed.txt
./shell/Finn.txt
./shell/miscolumnas.vsc
./shell/Text_example.txt
./shell/Text_twice.txt
./shell/numbers
./shell/column_name_number.sh
./opentraveldata/optd_airlines.csv
./opentraveldata/Nofokker.csv
./opentraveldata/fokker.csv
./opentraveldata/optd_aircraft_con_comas_con_csvformat.csv
./opentraveldata/ref_airline_nb_of_flights.csv
./opentraveldata/optd_cabecera
./opentraveldata/optd_aircraft.sql
./opentraveldata/optd_aircraft.csv
./opentraveldata/optd_aircraft_comma.csv


* Find only directories (-type d), here those whose name contains _op_

In [59]:
% cd ~/Data

! find -type d -name "*op*"

/home/dsc/Data
./opentraveldata


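* Use -not to invert a match. Note that -not only affects the test that immediately follows it. Also, -name is compared against the last component of the path only, which is why ./opentraveldata/prueba still appears in the output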

In [60]:
% cd ~/Data

! find -not -name "*op*" -type d 

/home/dsc/Data
.
./challenge
./airline_tickets
./us_dot
./us_dot/otp
./us_dot/traffic
./shell
./shell/pruebas_sql
./opentraveldata/prueba


* Find by permissions. Example: find files whose mode is exactly 777, that is, read, write and execute granted to owner, group and others. For help on permission modes, execute ```man chmod``` from the shell

In [61]:
% cd ~/Data

! find -perm 777

/home/dsc/Data
./shell/Text_example.txt
./shell/column_name_number.sh
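
`-perm 777` only matches a mode of exactly 777. GNU find also accepts a `-` prefix (all of the given bits are set) and a `/` prefix (any of them is set). The following sketch compares the three forms, assuming GNU find and three hypothetical scratch files:

```shell
workdir=$(mktemp -d)
touch "$workdir/all.sh" "$workdir/exec.sh" "$workdir/plain.txt"
chmod 777 "$workdir/all.sh"
chmod 755 "$workdir/exec.sh"
chmod 644 "$workdir/plain.txt"

exact=$(cd "$workdir" && find . -type f -perm 777 | sort)      # mode is exactly 777
at_least=$(cd "$workdir" && find . -type f -perm -755 | sort)  # every bit of 755 is set
any_exec=$(cd "$workdir" && find . -type f -perm /111 | sort)  # at least one execute bit is set

echo "-perm 777:";  printf '%s\n' "$exact"
echo "-perm -755:"; printf '%s\n' "$at_least"
echo "-perm /111:"; printf '%s\n' "$any_exec"

rm -rf "$workdir"
```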


* Search only in the current directory, without descending into subdirectories. Use -maxdepth 1

In [62]:
% cd ~/Data

! find  -maxdepth 1  -type d -name "*e*"


/home/dsc/Data
./challenge
./airline_tickets
./shell
./opentraveldata


* Execute a command for each file found with -exec. Example: count the lines of each file

In [63]:
% cd ~/Data

! find -name "*.txt" -exec wc -l {} \;

/home/dsc/Data
20 ./us_dot/otp/last_20.txt
7 ./shell/con_lineas.txt
7 ./shell/Text_4sed.txt
12363 ./shell/Finn.txt
7 ./shell/Text_example.txt
14 ./shell/Text_twice.txt


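Note that {} stands for the name of the file found. In addition, the command must be terminated with "\;" (the backslash keeps the shell from interpreting the semicolon).

* If we end the command with _+_ instead of _\;_, the file names are appended together and the command is executed once for the whole batch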

In [64]:
% cd ~/Data

! find -name "*.txt" -exec wc -l {} +

/home/dsc/Data
    20 ./us_dot/otp/last_20.txt
     7 ./shell/con_lineas.txt
     7 ./shell/Text_4sed.txt
 12363 ./shell/Finn.txt
     7 ./shell/Text_example.txt
    14 ./shell/Text_twice.txt
 12418 total
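
The difference between the two terminators can be made visible by counting the arguments each invocation receives. A minimal sketch, assuming three hypothetical scratch files; the inline `sh -c` script is only there to print the argument count:

```shell
workdir=$(mktemp -d)
touch "$workdir/a.txt" "$workdir/b.txt" "$workdir/c.txt"

# \; runs the command once per file: three invocations, one argument each.
per_file=$(find "$workdir" -name "*.txt" -exec sh -c 'echo "$# argument(s)"' sh {} \;)

# + appends the file names: here a single invocation with three arguments.
batched=$(find "$workdir" -name "*.txt" -exec sh -c 'echo "$# argument(s)"' sh {} +)

echo "with \\;:"; printf '%s\n' "$per_file"
echo "with +:";   printf '%s\n' "$batched"

rm -rf "$workdir"
```

With a very large number of matches, find may still split the `+` form into several invocations to stay under the system's argument-length limit.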


## Exercises

* Find all files located ONLY inside subdirectories of your home directory which have been modified in the last 2 minutes

In [65]:
% cd

! find -mindepth 2 -mmin -2

/home/dsc
./Repos/linux-shell-jupyter-notebooks
./Repos/linux-shell-jupyter-notebooks/004_Find_command.ipynb
./Repos/linux-shell-jupyter-notebooks/.ipynb_checkpoints/004_Find_command-checkpoint.ipynb
./.config/libreoffice/4/user
./.config/libreoffice/4/user/registrymodifications.xcu
./.config/google-chrome
./.config/google-chrome/BrowserMetrics/BrowserMetrics-5C0BA023-622.pma
./.config/google-chrome/Local State
./.config/google-chrome/Default
./.config/google-chrome/Default/QuotaManager-journal
./.config/google-chrome/Default/Shortcuts-journal
./.config/google-chrome/Default/TransportSecurity
./.config/google-chrome/Default/IndexedDB/https_www.google.es_0.indexeddb.leveldb
./.config/google-chrome/Default/IndexedDB/https_www.google.es_0.indexeddb.leveldb/LOG
./.config/google-chrome/Default/IndexedDB/https_www.google.es_0.indexeddb.leveldb/000003.log
./.config/google-chrome/Default/Current Session
./.config/google-chrome/Default/Local Extension Settings/cfhdojbkjhnklbpkdaibdccddilifddb/0
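
The same combination of tests can be tried on a controlled tree. This sketch assumes GNU find and GNU touch (for the relative date passed to `touch -d`); all file names are hypothetical.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/sub"
touch "$workdir/top.txt"                            # fresh, but directly in the start directory
touch "$workdir/sub/fresh.txt"                      # fresh, inside a subdirectory
touch -d '10 minutes ago' "$workdir/sub/stale.txt"  # GNU touch: backdate the modification time

# -mindepth 2 skips the start directory and its direct entries;
# -mmin -2 keeps what was modified less than 2 minutes ago.
recent=$(cd "$workdir" && find . -mindepth 2 -mmin -2 | sort)

printf '%s\n' "$recent"

rm -rf "$workdir"
```

Adding `-type f` would restrict the result to regular files, since directories whose contents changed recently also match -mmin.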

* Find all empty files inside DIRECT subdirectories of your home directory which do NOT have read-write-execute permissions given to all users. *I have changed the exercise in order to minimize the results*.

In [66]:
% cd

! find -mindepth 2 -maxdepth 2 -not -perm 777 -empty -type f -exec ls -l {} \;

/home/dsc
-rw-r--r-- 1 dsc dsc 0 nov 24 10:24 ./vacio/fichero.txt
-rw-rw-rw- 1 dsc dsc 0 dic  6 20:44 ./basura/asdsa.txt
-rw-r--r-- 1 dsc dsc 0 dic  7 00:10 ./Downloads/.Rhistory
-rw-r--r-- 1 dsc dsc 0 mar 12  2018 ./anaconda3/vscode_inst.py.log
-rw-r--r-- 1 dsc dsc 0 nov 24 10:52 ./one/holaConMismosPermisos.txt
-rw-r--r-- 1 dsc dsc 0 nov 24 10:52 ./one/holaNuevoNombre.txt
-rw-rw-r-- 1 dsc dsc 0 mar 12  2018 ./.rstudio-desktop/history_database


* Extend the previous command to grant these permissions using the -ok option.

In this case the command would be:

```shell
find -mindepth 2 -maxdepth 2 -not -perm 777 -empty -type f -ok chmod 777 {} \;
```

The -ok option works like -exec, but find asks for confirmation (y/n) before running the command on each file found.
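
Since -ok is interactive, it cannot be shown in a non-interactive cell. A different way to review the changes first is a dry run that prints each command instead of executing it. This is an alternative to -ok, not the method the exercise asks for, and the scratch file used here is hypothetical.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/sub"
: > "$workdir/sub/empty.txt"
chmod 644 "$workdir/sub/empty.txt"

# Dry run: echo prints the chmod command instead of executing it.
preview=$(cd "$workdir" && find . -mindepth 2 -maxdepth 2 -not -perm 777 -empty -type f -exec echo chmod 777 {} \;)

# The file is untouched, so it still matches its original mode.
still_644=$(cd "$workdir" && find . -type f -perm 644)

printf '%s\n' "$preview"

rm -rf "$workdir"
```

Once the printed commands look right, removing `echo` applies them.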

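* Get the 3 largest files in each subdirectory of ~/Data/. *Important*: in this exercise we chain commands: two -exec actions, the second of which runs a shell pipeline. This technique is worth knowing. Note that `ls -S` also lists subdirectories, so they appear among the results.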

In [67]:
% cd ~/Data


! find ~/Data/ -type d -exec echo "- Three largest files of " {} \; -exec sh -c "ls -S {} | head -3 " \;


/home/dsc/Data
- Three largest files of  /home/dsc/Data/
airline_tickets
challenge
opentraveldata
- Three largest files of  /home/dsc/Data/challenge
bookings.csv.bz2
searches.csv.bz2
- Three largest files of  /home/dsc/Data/airline_tickets
sales_segments.csv.gz
data.tar.bz2
- Three largest files of  /home/dsc/Data/us_dot
otp
traffic
- Three largest files of  /home/dsc/Data/us_dot/otp
On_Time_On_Time_Performance_2015_7.zip
On_Time_On_Time_Performance_2015_6.zip
On_Time_On_Time_Performance_2015_8.zip
- Three largest files of  /home/dsc/Data/us_dot/traffic
T100_SEGMENT_ALL_CARRIER_2014.zip
T100_SEGMENT_ALL_CARRIER_2015.zip
- Three largest files of  /home/dsc/Data/shell
Finn.txt
zippped.zip
pruebas_sql
- Three largest files of  /home/dsc/Data/shell/pruebas_sql
- Three largest files of  /home/dsc/Data/opentraveldata
optd_por_public.csv
opentravel.gz.tar
optd_airlines.csv
- Three largest files of  /home/dsc/Data/opentraveldata/prueba
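
When {} is pasted inside the `sh -c` script text, a directory name containing spaces or quotes is re-interpreted by the shell. A more robust variant passes the name as a positional parameter instead. This is a sketch on a hypothetical scratch directory (`my data`, with four generated files), not the notebook's ~/Data tree:

```shell
workdir=$(mktemp -d)
mkdir "$workdir/my data"

# Four files of 10, 20, 30 and 40 bytes (printf pads an empty string to n spaces).
for n in 10 20 30 40; do
  printf "%${n}s" '' > "$workdir/my data/size_$n.bin"
done

# The directory reaches the script as "$1"; the extra "sh" fills $0.
report=$(find "$workdir/my data" -type d \
  -exec echo "- Three largest files of" {} \; \
  -exec sh -c 'ls -S "$1" | head -n 3' sh {} \;)

printf '%s\n' "$report"

rm -rf "$workdir"
```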


Sources:
https://www.servidoresadmin.com/comando-find-en-linux-shell-script/

<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css">
<div style="text-align:right">
Juan Luis García López (@huanlui)
<a href="https://github.com/huanlui" class="fa fa-github"> Github </a>
<a href="https://twitter.com/huanlui" class="fa fa-twitter"> Twitter </a>
<a href="https://www.linkedin.com/in/juan-luis-garcía-lópez-99057138" class="fa fa-linkedin"> Linkedin </a>
</div>
