## glob tutorial
**Shizhe Cai**
This tutorial introduces glob, a widely used but easily overlooked module in the Python standard library.
Please check this [link](https://pynative.com/python-glob/#:~:text=Python%20glob.,UNIX%20shell%2Dstyle%20wildcards) for more info.

In [1]:
import glob

#### 1. Asterisk (*)
The asterisk (*) matches zero or more characters within a single file or directory name.

Q1: How to find all the __ipynb__ files?

In [7]:
print(glob.glob('*.ipynb'))

['glob_tutorial.ipynb', 'os_tutorial.ipynb']


Q2: How to find the __txt__ files in the CommandLines folder?

In [9]:
path = r"C:\Users\Shizh\OneDrive - Maastricht University\Code\Spoon-Knife\CommandLines"
glob.glob(path + '/*.txt')

['C:\\Users\\Shizh\\OneDrive - Maastricht University\\Code\\Spoon-Knife\\CommandLines\\Git_code_ssh.txt',
 'C:\\Users\\Shizh\\OneDrive - Maastricht University\\Code\\Spoon-Knife\\CommandLines\\Unzip_Tar_cmd.txt']

The same search can also be done recursively.

> The glob module supports the __\*\*__ directive. When you set the recursive flag to True, glob searches the given path recursively through its subdirectories.

In [14]:
# use the path of the whole directory (named root_dir to avoid shadowing the built-in dir())
root_dir = r"C:\Users\Shizh\OneDrive - Maastricht University\Code\Spoon-Knife"
for file in glob.glob(root_dir + '/**/*.txt', recursive=True):
    print(file)

C:\Users\Shizh\OneDrive - Maastricht University\Code\Spoon-Knife\CommandLines\Git_code_ssh.txt
C:\Users\Shizh\OneDrive - Maastricht University\Code\Spoon-Knife\CommandLines\Unzip_Tar_cmd.txt
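
To make the effect of the recursive flag concrete, here is a minimal sketch that builds a small temporary directory tree and runs the same pattern with and without recursive=True. The file and folder names are hypothetical and only serve the illustration.

```python
import glob
import os
import tempfile

# Build a throwaway tree with hypothetical names so the example runs anywhere.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    for rel in ("top.txt", os.path.join("a", "mid.txt"), os.path.join("a", "b", "deep.txt")):
        open(os.path.join(root, rel), "w").close()

    pattern = os.path.join(root, "**", "*.txt")

    # Without recursive=True, '**' behaves like '*': exactly one directory level.
    one_level = sorted(os.path.basename(p) for p in glob.glob(pattern))

    # With recursive=True, '**' matches zero or more directory levels.
    all_levels = sorted(os.path.basename(p) for p in glob.glob(pattern, recursive=True))

print(one_level)   # ['mid.txt']
print(all_levels)  # ['deep.txt', 'mid.txt', 'top.txt']
```

Note that the recursive search also returns top.txt, because the directive can match zero directories.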


Q3: How to find files whose names include '_tutorial'?

In [19]:
print(glob.glob('*_tutorial*'))

['glob_tutorial.ipynb', 'os_tutorial.ipynb']
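
One detail worth knowing about the asterisk: by default it does not match file names that begin with a dot. The sketch below illustrates this in a temporary directory; the file names are made up for the illustration and are not part of the folders used above.

```python
import glob
import os
import tempfile

# Throwaway directory with hypothetical file names, so the example runs anywhere.
with tempfile.TemporaryDirectory() as root:
    for name in ("visible.txt", ".hidden.txt"):
        open(os.path.join(root, name), "w").close()

    # '*' skips names that start with a dot ...
    star_matches = sorted(os.path.basename(p) for p in glob.glob(os.path.join(root, "*.txt")))
    # ... unless the pattern itself starts with a dot.
    dot_matches = sorted(os.path.basename(p) for p in glob.glob(os.path.join(root, ".*.txt")))

print(star_matches)  # ['visible.txt']
print(dot_matches)   # ['.hidden.txt']
```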


#### 2. Question Mark (?)

The question mark __?__ matches exactly one character in a file name.

Q1: How to find files with exactly __4__ characters before *'_tutorial'*?

In [20]:
print(glob.glob('????_tutorial*'))

['glob_tutorial.ipynb']
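
The difference between the two wildcards is easiest to see side by side. The following sketch assumes three hypothetical notebook names in a temporary directory and compares patterns that fix the prefix length with one that does not.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical file names with prefixes of length 2, 4 and 7.
    for name in ("os_tutorial.ipynb", "glob_tutorial.ipynb", "pathlib_tutorial.ipynb"):
        open(os.path.join(root, name), "w").close()

    def names(pattern):
        """Return the sorted base names matching `pattern` inside the temporary directory."""
        return sorted(os.path.basename(p) for p in glob.glob(os.path.join(root, pattern)))

    two_chars = names("??_tutorial.ipynb")     # exactly two characters before '_tutorial'
    four_chars = names("????_tutorial.ipynb")  # exactly four characters
    any_prefix = names("*_tutorial.ipynb")     # any number of characters

print(two_chars)   # ['os_tutorial.ipynb']
print(four_chars)  # ['glob_tutorial.ipynb']
print(any_prefix)  # ['glob_tutorial.ipynb', 'os_tutorial.ipynb', 'pathlib_tutorial.ipynb']
```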


#### 3. Square Brackets ([])
Square brackets ([]) match a single character from the enclosed set or range of characters or digits, such as [a-c] or [0-9].

Q1: How to find a file with a start character from [f-h] and an ending character from [a-c]?


In [29]:
root_dir = r"C:\Users\Shizh\OneDrive - Maastricht University\Code\Spoon-Knife"
for file in glob.glob(root_dir + '/**/[f-h]*.*[a-c]', recursive=True):
    print(file)

C:\Users\Shizh\OneDrive - Maastricht University\Code\Spoon-Knife\GeneralTutorials\glob_tutorial.ipynb
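
Brackets can hold a range, an explicit set of characters, or a negated set written as [!...]. The negated form is not covered above, so treat the sketch below as a small extension; the data file names are hypothetical.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical file names that differ only in the character before the extension.
    for name in ("data1.csv", "data2.csv", "data9.csv", "dataA.csv"):
        open(os.path.join(root, name), "w").close()

    def names(pattern):
        """Return the sorted base names matching `pattern` inside the temporary directory."""
        return sorted(os.path.basename(p) for p in glob.glob(os.path.join(root, pattern)))

    in_range = names("data[0-5].csv")       # one character from the range 0-5
    in_set = names("data[19].csv")          # one character that is either 1 or 9
    not_in_range = names("data[!0-5].csv")  # one character that is NOT in 0-5

print(in_range)      # ['data1.csv', 'data2.csv']
print(in_set)        # ['data1.csv', 'data9.csv']
print(not_in_range)  # ['data9.csv', 'dataA.csv']
```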


#### 4. iglob() as an iterator

__glob.glob()__ returns a list of matched files, while __glob.iglob()__ returns an iterator over the matched files.

This method is useful when the list of matches is too large to hold in memory.

In [32]:
path = r"C:\Users\Shizh\OneDrive - Maastricht University\Code\Spoon-Knife\CommandLines"
print(f"glob.glob will return a list: {glob.glob(path + '/*.txt')}")

print('but ')
for i, pth in enumerate(glob.iglob(path + '/*.txt')):
    print(f'the path of file number {i} is {pth}')

glob.glob will return a list: ['C:\\Users\\Shizh\\OneDrive - Maastricht University\\Code\\Spoon-Knife\\CommandLines\\Git_code_ssh.txt', 'C:\\Users\\Shizh\\OneDrive - Maastricht University\\Code\\Spoon-Knife\\CommandLines\\Unzip_Tar_cmd.txt']
but 
the path of file number 0 is C:\Users\Shizh\OneDrive - Maastricht University\Code\Spoon-Knife\CommandLines\Git_code_ssh.txt
the path of file number 1 is C:\Users\Shizh\OneDrive - Maastricht University\Code\Spoon-Knife\CommandLines\Unzip_Tar_cmd.txt
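
Because iglob() returns an iterator, you can stop early and the iterator can only be consumed once. A minimal sketch of both properties, assuming five hypothetical text files in a temporary directory (the order in which matches arrive is not guaranteed, so no specific file names are shown):

```python
import glob
import itertools
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    for i in range(5):
        open(os.path.join(root, f"file_{i}.txt"), "w").close()

    matches = glob.iglob(os.path.join(root, "*.txt"))
    is_list = isinstance(matches, list)  # False: iglob does not build a list

    # Take only the first two matches without collecting the rest.
    first_two = [os.path.basename(p) for p in itertools.islice(matches, 2)]

    # The iterator continues where it stopped ...
    remaining = [os.path.basename(p) for p in matches]

    # ... and yields nothing once it is exhausted.
    exhausted = list(matches)

print(is_list, len(first_two), len(remaining), exhausted)  # False 2 3 []
```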


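#### 5. escape() for Special Characters

glob.escape() escapes the glob special characters (?, * and [) so that they are matched literally. Other characters such as $, &, @, _ and - have no special meaning in a glob pattern, so escape() returns them unchanged, as in the example below.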

In [33]:
char_seq = "_$#-"
for char in char_seq:
    # none of these characters is special to glob, so escape() leaves them as they are
    esc_set = "*" + glob.escape(char) + "*" + ".ipynb"
    for file in glob.glob(esc_set):
        print(file)

glob_tutorial.ipynb
os_tutorial.ipynb
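
escape() matters most when a file name itself contains one of the glob special characters. The sketch below assumes two hypothetical files, one with square brackets in its name, and shows that the unescaped name is read as a pattern while the escaped name matches literally.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical file names; the first contains literal square brackets.
    for name in ("report[1].txt", "report1.txt"):
        open(os.path.join(root, name), "w").close()

    def names(pattern):
        """Return the sorted base names matching `pattern` inside the temporary directory."""
        return sorted(os.path.basename(p) for p in glob.glob(os.path.join(glob.escape(root), pattern)))

    # Unescaped, '[1]' is a character class that matches the single character '1'.
    unescaped = names("report[1].txt")

    # Escaped, the brackets are matched literally.
    escaped_pattern = glob.escape("report[1].txt")
    escaped = names(escaped_pattern)

    # Characters that are not special to glob pass through unchanged.
    untouched = glob.escape("_$#-")

print(unescaped)        # ['report1.txt']
print(escaped_pattern)  # report[[]1].txt
print(escaped)          # ['report[1].txt']
print(untouched)        # _$#-
```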


#### 6. Multiple Extensions
We can search for files with several different extensions by running one glob pattern per extension and combining the results.


In [37]:
path = r"C:\Users\Shizh\OneDrive - Maastricht University\Code\Spoon-Knife"

print("All ipynb and txt files")
extensions = ('*.ipynb', '*.txt')
files_list = []
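for ext in extensions:
    # without recursive=True, '**' behaves like '*', so this matches files exactly one folder below path
    files_list.extend(glob.glob(path + '/**/' + ext))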
print(files_list)

All ipynb and txt files
['C:\\Users\\Shizh\\OneDrive - Maastricht University\\Code\\Spoon-Knife\\GeneralTutorials\\glob_tutorial.ipynb', 'C:\\Users\\Shizh\\OneDrive - Maastricht University\\Code\\Spoon-Knife\\GeneralTutorials\\os_tutorial.ipynb', 'C:\\Users\\Shizh\\OneDrive - Maastricht University\\Code\\Spoon-Knife\\CommandLines\\Git_code_ssh.txt', 'C:\\Users\\Shizh\\OneDrive - Maastricht University\\Code\\Spoon-Knife\\CommandLines\\Unzip_Tar_cmd.txt']
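
If the files may sit at any depth, the same loop can be wrapped in a small helper that searches recursively. This is only a sketch: glob_extensions is a hypothetical helper name, not part of the glob module, and the directory tree is made up for the illustration.

```python
import glob
import os
import tempfile


def glob_extensions(root, extensions):
    """Hypothetical helper: recursively collect files under `root` with one of `extensions`."""
    found = []
    for ext in extensions:
        # Escape the root in case the folder name contains glob special characters.
        pattern = os.path.join(glob.escape(root), "**", "*" + ext)
        found.extend(glob.glob(pattern, recursive=True))
    return sorted(found)


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "notes", "old"))
    for rel in (
        "a.ipynb",
        os.path.join("notes", "b.txt"),
        os.path.join("notes", "old", "c.txt"),
        os.path.join("notes", "d.md"),
    ):
        open(os.path.join(root, rel), "w").close()

    found_names = sorted(os.path.basename(p) for p in glob_extensions(root, (".ipynb", ".txt")))

print(found_names)  # ['a.ipynb', 'b.txt', 'c.txt']
```

The .md file is skipped because its extension is not in the requested tuple, and c.txt is found even though it sits two folders below the root.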
