# Some More Powerful Tools
------------------------------------------------------------------------------------------------------------------------

### References

Ephemeris data were downloaded from the Jet Propulsion Laboratory's _HORIZONS_ System:

> Jet Propulsion Laboratory (2019). "Horizons On-Line Ephemeris System." Accessed 2019-10-25 from https://ssd.jpl.nasa.gov/horizons.cgihorizons_doc.

For more information about any of the commands below, use its _--help_ flag:

> find --help

> grep --help

> curl --help

## find

The __find__ command searches recursively through directories, beginning from a specified starting point, for files that match criteria such as name or type. A simple use in our case is to find the text files in our sample data folder:

> find . -iname '*.txt'
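To try the name tests without touching real data, the following is a minimal sketch that builds a throwaway directory and compares _-name_ with _-iname_. The directory layout and filenames are made up for illustration, and it assumes a Unix-like _find_ (GNU or BSD) that supports _-iname_. The pattern is quoted so that the shell hands it to _find_ unchanged instead of expanding it first.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Build a throwaway directory tree (all names are hypothetical).
demo_dir="$(mktemp -d)"
trap 'rm -rf "$demo_dir"' EXIT
mkdir -p "$demo_dir/books" "$demo_dir/ephemerides"
touch "$demo_dir/books/alice.txt" \
      "$demo_dir/books/NOTES.TXT" \
      "$demo_dir/ephemerides/mars_ephemeris.txt" \
      "$demo_dir/readme.md"

# -name is case sensitive: NOTES.TXT and readme.md are skipped.
name_matches="$(find "$demo_dir" -type f -name '*.txt' | sort)"

# -iname ignores case: NOTES.TXT is matched as well.
iname_matches="$(find "$demo_dir" -type f -iname '*.txt' | sort)"

echo "Case-sensitive matches:"
echo "$name_matches"
echo "Case-insensitive matches:"
echo "$iname_matches"
```

Leaving the quotes off can appear to work, but if the current directory happens to contain a matching file, the shell expands the pattern before _find_ ever sees it.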

In [6]:
%%bash
# list everything (files and directories) under the current directory, recursively
find .

FIND: Parameter format not correct


In [7]:
%%bash
# list only the directories under the current directory, recursively
find . -type d

FIND: Parameter format not correct


In [14]:
%%bash
# list the files in the current directory and subdirectories
find . -type f

FIND: Parameter format not correct


In [15]:
%%bash
# list the plain text files in the current directory and subdirectories
find . -name '*.txt'

FIND: Parameter format not correct


In [None]:
%%bash
# same as above, only case insensitive
find . -iname '*.TXT'

In our sample data directory, we have the top ten downloads from Project Gutenberg as well as ephemerides for the planets in the solar system (and Pluto). Our previous search found all of them, but we can also search for specific terms in the filenames.

In [16]:
%%bash
# list files whose names contain "ephemeris" (case insensitive)
find . -iname '*ephemeris*'

File not found - *ephemeris*


## grep

It's also possible to search within the content of files using __grep__. In the example below, we search for the word 'mars' in every file in the current directory and any subdirectories. The _-i_ flag makes the search case insensitive and the _-r_ flag makes it recursive. The _--include_ flag, used in the second example, lets us use a wildcard to filter on filenames. When searching within a single file, neither _-r_ nor _--include_ is needed, only the filename.

> grep -i -r 'mars'

This produces a lot of results - many of our Project Gutenberg texts also include the word 'Mars' or 'mars.' To search just on ephemerides, we can use the _--include_ flag:

> grep -i -r 'Mars' --include="*ephemeris*"
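The effect of _--include_ can be checked on a small scale. This is a minimal sketch using a temporary directory with made-up files and contents. It assumes a _grep_ that supports _--include_ (GNU grep does, as does the BSD grep shipped with macOS). The _-l_ flag, which is not used above, prints only the names of matching files.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Temporary sandbox; file names and contents are hypothetical.
demo_dir="$(mktemp -d)"
trap 'rm -rf "$demo_dir"' EXIT
mkdir -p "$demo_dir/books" "$demo_dir/ephemerides"
printf 'A made-up sentence that mentions Mars.\n' > "$demo_dir/books/novel.txt"
printf 'Target body: Mars\n' > "$demo_dir/ephemerides/mars_ephemeris.txt"
printf 'Target body: Venus\n' > "$demo_dir/ephemerides/venus_ephemeris.txt"

# Recursive, case-insensitive search over every file.
all_hits="$(grep -r -i -l 'mars' "$demo_dir" | sort)"

# Same search, restricted to files whose names contain "ephemeris".
ephemeris_hits="$(grep -r -i -l 'mars' --include='*ephemeris*' "$demo_dir" | sort)"

echo "All files mentioning mars:"
echo "$all_hits"
echo "Ephemeris files mentioning mars:"
echo "$ephemeris_hits"
```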

## curl

cURL is a data transfer utility which supports numerous protocols, including HTTP(S), FTP(S), SFTP, and SCP. By default it writes whatever it downloads to standard output.

> curl https://www.usconstitution.net/const.txt

> curl wttr.in/Albuquerque

> curl wttr.in/Moon

In [10]:
%%bash
#curl wttr.in/Moon
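For experimenting without a network connection, the sketch below fetches a local file through a _file://_ URL. It assumes the installed _curl_ was built with support for the FILE protocol, which is common but not guaranteed. The filenames are made up. The _-s_ flag hides the progress meter, _-S_ keeps error messages visible, and _-o_ writes the result to a file instead of standard output.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Temporary sandbox with a small source file (hypothetical names).
demo_dir="$(mktemp -d)"
trap 'rm -rf "$demo_dir"' EXIT
printf 'line one\nline two\n' > "$demo_dir/source.txt"

# demo_dir is an absolute path, so this expands to file:///...
curl -sS -o "$demo_dir/copy.txt" "file://$demo_dir/source.txt"

# Read the downloaded copy back to confirm the transfer.
fetched="$(cat "$demo_dir/copy.txt")"
echo "$fetched"
```

The same flags apply to the http(s) examples above; only the URL changes.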



## Using pipes

Pipes allow us to pass the output of one command as the input to another. For example, instead of using _find_ to locate files as above, we could have used a combination of _ls_ and _grep_:

> ls -R . | grep -i "untitled"

> curl https://www.usconstitution.net/const.txt | grep --context=2 -i "peaceably to assemble"

> curl https://www.usconstitution.net/const.txt | grep --context=2 -i "peaceably to assemble" > bill.txt && more bill.txt
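The same pattern can be tried offline. In this minimal sketch, _printf_ stands in for _curl_ as the source of text, and the sample lines are made up. It assumes a _grep_ that accepts the long _--context_ option, as GNU grep does. The output is first piped from one command to the next, then redirected to a file, and the file is displayed only after the redirect has finished.

```bash
#!/usr/bin/env bash
set -euo pipefail

demo_dir="$(mktemp -d)"
trap 'rm -rf "$demo_dir"' EXIT

# Stage 1 | Stage 2: printf produces five lines, grep keeps the match
# plus one line of context on either side.
matches="$(printf '%s\n' 'alpha' 'beta' 'the right to assemble' 'gamma' 'delta' \
  | grep --context=1 -i 'ASSEMBLE')"

# A further pipe: count the lines that survived the filter.
match_count="$(printf '%s\n' "$matches" | wc -l | tr -d ' ')"
echo "Lines kept: $match_count"

# Redirect to a file, then view it. A redirect sends nothing down a
# following pipe, so the two steps are joined with && instead of |.
printf '%s\n' "$matches" > "$demo_dir/matches.txt" && cat "$demo_dir/matches.txt"
```

Here _cat_ is used in place of _more_ so that the script also runs in a non-interactive cell.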

In [11]:
%%bash
ls -R ../../ | grep -i "untitled"

bash: line 1: ls: command not found


In [12]:
%%bash
curl https://www.usconstitution.net/const.txt | grep --context=2 -i "peaceably to assemble"

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (77) error setting certificate verify locations:
  CAfile: C:/ProgramData/Anaconda3/Library/mingw-w64/ssl/certs/ca-bundle.crt
  CApath: none


In [13]:
%%bash
curl https://www.usconstitution.net/const.txt | grep --context=2 -i "peaceably to assemble" > bill.txt #| more bill.txt

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (77) error setting certificate verify locations:
  CAfile: C:/ProgramData/Anaconda3/Library/mingw-w64/ssl/certs/ca-bundle.crt
  CApath: none
