# Bonus

Suppose we have thousands of files with US-style dates in their filenames `(MM-DD-YYYY)` and we want to rename them with EU-style dates `(DD-MM-YYYY)`. Doing this by hand could take all day, and it would be boring! We can write a program to do it instead.

### Regular Expression -- The nuke to manipulate strings/text
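
As a small taste of what a regular expression can do, here is a minimal sketch that swaps the month and day inside a string using `re.sub()`. The sample text is made up for illustration, and the pattern is deliberately simpler than the one used for filenames later on.

```python
import re

# Hypothetical sample text; the dates are made up for illustration.
text = "Reports due 04-26-2018 and 04-27-2018."

# Capture month, day and year in three groups, then swap the first
# two in the replacement string (\2 is the day, \1 is the month).
swapped = re.sub(r"\b(\d{2})-(\d{2})-(\d{4})\b", r"\2-\1-\3", text)
print(swapped)  # Reports due 26-04-2018 and 27-04-2018.
```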

### A small Python script to rename files

In [2]:
import shutil, os, re

In [3]:
target_dir = "/Users/jin/JupyterProjects/acf701/us_to_eu/"

#### Let's have a look at what's inside our target folder

In [4]:
for f in os.listdir(target_dir):
    print(f)

.DS_Store
27-04-2018.txt
accounting26-04-2018.txt
accounting26-04-2018copy.txt
accounting27-04-2018.txt
dont_rename_me.txt
finance26-04-2018.csv
lancaster26-04-2018.txt


#### Step 1 create a regex that matches filenames with US date format

In [5]:
# regex pattern
usPattern = re.compile(r"""^(.*?)((0|1)?\d)-((0|1|2|3)?\d)-((19|20)\d\d)(.*?)$""", re.VERBOSE)

- `^(.*?)`: any text before the date; `^` marks the start of the text
- `((0|1)?\d)`: one or two digits for the month; the leading 0 or 1 is optional, and `\d` matches any single digit from 0 to 9
- `((0|1|2|3)?\d)`: one or two digits for the day; the leading 0, 1, 2 or 3 is optional
- `((19|20)\d\d)`: four digits for the year, which must start with 19 or 20
- `(.*?)$`: any text after the date; `$` marks the end of the text
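
To see how the groups line up, the pattern can be tried on a single name. This is a minimal sketch; the sample filename `accounting04-26-2018.txt` is an assumption made up for illustration, not a file from the folder above.

```python
import re

# Same pattern as in Step 1.
usPattern = re.compile(r"""^(.*?)((0|1)?\d)-((0|1|2|3)?\d)-((19|20)\d\d)(.*?)$""", re.VERBOSE)

# Hypothetical sample filename, used only for illustration.
sample = "accounting04-26-2018.txt"
match = usPattern.search(sample)

# Nested parentheses create groups too, so the whole month, day and
# year are groups 2, 4 and 6; groups 3, 5 and 7 hold only the leading
# digit(s) matched inside them.
before, month, day, year, after = match.group(1, 2, 4, 6, 8)
print(before, month, day, year, after)  # accounting 04 26 2018 .txt
print(match.group(3), match.group(5), match.group(7))  # 0 2 20
```

This is why the renaming loop reads groups 1, 2, 4, 6 and 8 rather than 1 to 5.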

#### Step 2 collect date information from original filename
#### Step 3 rename all files

In [7]:
# loop through all files under a target folder
for usFilename in os.listdir(target_dir):
    # local variable for each file
    dateInfo = usPattern.search(usFilename)
    # skip files without a date
    if dateInfo is None:
        print("*WARNING*: filename is not matched")
        continue
    # collect different parts of the filename
    beforePart = dateInfo.group(1) # (.*?)
    monthPart  = dateInfo.group(2) # the whole group ((0|1)?\d)
    dayPart    = dateInfo.group(4) # the whole group ((0|1|2|3)?\d)
    yearPart   = dateInfo.group(6) # the whole group ((19|20)\d\d)
    afterPart  = dateInfo.group(8) # (.*?)
    
    # construct EU-style date filename
    euFilename = beforePart + dayPart + '-' + monthPart + '-' + yearPart + afterPart
    
    # construct full path for each file: location + filename
    usFilepath = os.path.join(target_dir, usFilename)
    euFilepath = os.path.join(target_dir, euFilename)
    
    # rename files
    print("renaming {} to {}...".format(usFilename, euFilename))
    shutil.move(usFilepath, euFilepath)

renaming 204-7-2018.txt to 27-04-2018.txt...
renaming accounting204-6-2018.txt to accounting26-04-2018.txt...
renaming accounting204-6-2018copy.txt to accounting26-04-2018copy.txt...
renaming accounting204-7-2018.txt to accounting27-04-2018.txt...
renaming finance204-6-2018.csv to finance26-04-2018.csv...
renaming lancaster204-6-2018.txt to lancaster26-04-2018.txt...
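
The source names in this output (for example `204-7-2018.txt`) are consistent with the script having been run on files that were already renamed. The lazy `(.*?)` at the start can absorb a leading digit, so `27-04-2018.txt` still matches, with `7` taken as the month. The sketch below reproduces that behaviour and tries one possible stricter pattern. `strict_pattern` and `swap_date` are hypothetical names, and the lookarounds encode an assumption that a date is never directly preceded or followed by another digit.

```python
import re

# Original pattern from Step 1.
usPattern = re.compile(r"""^(.*?)((0|1)?\d)-((0|1|2|3)?\d)-((19|20)\d\d)(.*?)$""", re.VERBOSE)

# Hypothetical stricter pattern: month 1-12, day 1-31, and no digit
# allowed directly before or after the date.
strict_pattern = re.compile(
    r"^(.*?)(?<!\d)(0?[1-9]|1[0-2])-(0?[1-9]|[12]\d|3[01])-((?:19|20)\d\d)(?!\d)(.*)$"
)

def swap_date(pattern, filename, groups):
    """Return filename with month and day swapped, or None if no match."""
    match = pattern.search(filename)
    if match is None:
        return None
    before, month, day, year, after = match.group(*groups)
    return before + day + "-" + month + "-" + year + after

# The original pattern keeps matching names that were already converted.
print(swap_date(usPattern, "27-04-2018.txt", (1, 2, 4, 6, 8)))       # 204-7-2018.txt
print(swap_date(usPattern, "204-7-2018.txt", (1, 2, 4, 6, 8)))       # 27-04-2018.txt

# The stricter pattern rejects 27 as a month but still converts a US date.
print(swap_date(strict_pattern, "27-04-2018.txt", (1, 2, 3, 4, 5)))  # None
print(swap_date(strict_pattern, "04-27-2018.txt", (1, 2, 3, 4, 5)))  # 27-04-2018.txt
```

Dates where both numbers are 12 or less, such as `04-05-2018`, remain ambiguous under any pattern and would still be swapped back on a second run.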


### Can you improve this script to rename all files under a directory tree? Hint: use the `os.walk()` function.
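
One possible approach to the exercise above, as a sketch rather than the only answer. The function name `rename_tree` and its `dry_run` parameter are assumptions introduced here. The renaming logic is the same as in Step 3, applied to every folder that `os.walk()` visits.

```python
import os
import re
import shutil

# Same pattern as in Step 1.
usPattern = re.compile(r"""^(.*?)((0|1)?\d)-((0|1|2|3)?\d)-((19|20)\d\d)(.*?)$""", re.VERBOSE)

def rename_tree(root_dir, dry_run=True):
    """Swap month and day in every matching filename under root_dir.

    Returns a list of (old_path, new_path) pairs. With dry_run=True
    nothing is renamed, so the plan can be checked first.
    """
    planned = []
    # os.walk yields (folder, subfolder names, file names) for each folder
    for folder, _subfolders, filenames in os.walk(root_dir):
        for usFilename in filenames:
            dateInfo = usPattern.search(usFilename)
            if dateInfo is None:
                continue  # skip files without a date
            before, month, day, year, after = dateInfo.group(1, 2, 4, 6, 8)
            euFilename = before + day + "-" + month + "-" + year + after
            usFilepath = os.path.join(folder, usFilename)
            euFilepath = os.path.join(folder, euFilename)
            planned.append((usFilepath, euFilepath))
            if not dry_run:
                shutil.move(usFilepath, euFilepath)
    return planned
```

Calling `rename_tree(target_dir)` first and printing the returned pairs shows what would change; `rename_tree(target_dir, dry_run=False)` then performs the renames.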