Welcome to Enrichment Lesson A of the Noisebridge Python Class ([Noisebridge Wiki](https://www.noisebridge.net/wiki/PyClass) | [GitHub](https://github.com/audiodude/PythonClass))

Here we'll be taking a break from drilling language features and data structures, and trying to see some of the more interesting things you can do with Python.

In this case, we will be organizing our mp3 and book libraries using Python **file system operations**.

If you've used a computer, you're familiar with **files**. Files live on a file system, which is on some kind of device like a hard drive or USB stick. Within the file system, there are **directories** (often called 'folders' in GUI operating systems like Windows and macOS), which can contain files and other directories.

With Python, we can do all the things with files that we can do within our operating system: open them and write to them (which we have already seen), check if they exist, copy and rename them, etc.

# Our directories 

Alongside this notebook are two directories: `books` and `music`.

The `books` directory contains directories for authors, along with directories for each of the ebooks we have for them. The ebook directories themselves contain ebooks in various formats plus some other files like cover art.

The `music` directory contains 3 subdirectories with a bunch of songs randomly collected in them, with no information on artist, album or song title. However, we suspect that this information may be encoded in the files themselves as [ID3 metadata](https://en.wikipedia.org/wiki/ID3).

# 'Walking' the directories

We will be using the Python 3 `pathlib` library to perform our file system operations. You can [read more about it](https://docs.python.org/3/library/pathlib.html) if you like. It provides us with high level operations so that we can manipulate **paths** (which are like URLs for files on your computer, like '/Users/tmoney/code/PythonClass/lessons/lesson_1.ipynb').

First, if we refer to a path that doesn't start with a slash (a **relative** path), Python resolves it against the current working directory. For a Jupyter notebook, that is normally the directory containing the notebook.

In [None]:
from pathlib import Path

books_dir = Path('books')
print('books dir exists:', books_dir.exists())
print('books dir is a directory:', books_dir.is_dir())
print('books dir is a file:', books_dir.is_file())
print()

banana_file = books_dir / 'banana.txt'
print('banana_file exists:', banana_file.exists())
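
Once we have a `Path`, we can also pull it apart into pieces, which will come in handy when we start renaming and sorting files. The following is a small sketch using a made-up path (`some_author` and `some_book.epub` are placeholders, not files from this lesson); the path doesn't need to exist for these attributes to work.

```python
from pathlib import Path

# Hypothetical path, for illustration only. Nothing here touches the disk.
example = Path('books') / 'some_author' / 'some_book.epub'

print(example.name)           # the file name, with its extension
print(example.stem)           # the file name, without its extension
print(example.suffix)         # the extension, including the dot
print(example.parent)         # the directory that contains the file
print(example.is_absolute())  # False, because the path doesn't start at the root
```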

We can use the `glob()` method to find files or directories that are directly under a given path. We pass `glob()` a pattern, and only items whose names match the pattern are returned. Here, `*` represents "anything", so `pg*` matches names that start with "pg" and `*6*` matches names that contain a "6". Glob is kind of like a simple search.

In [None]:
for file_or_dir in books_dir.glob('pg*'):
  print(file_or_dir)

print()

for file_or_dir in books_dir.glob('*6*'):
  print(file_or_dir)
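
If you'd like to experiment with patterns without touching the `books` directory, here is a rough sketch that builds a throwaway directory and globs inside it. The file names are invented for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
  demo_dir = Path(tmp)
  # Create a few empty files with made-up names
  for name in ['pg100.txt', 'pg1661.epub', 'notes.txt']:
    (demo_dir / name).touch()

  # glob() returns items in no particular order, so we sort the names
  starts_with_pg = sorted(p.name for p in demo_dir.glob('pg*'))
  contains_6 = sorted(p.name for p in demo_dir.glob('*6*'))
  text_files = sorted(p.name for p in demo_dir.glob('*.txt'))

print(starts_with_pg)
print(contains_6)
print(text_files)
```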

We can also use the `iterdir()` method, which will return a Path object for every directory and file in the directory.

In [None]:
for item in books_dir.iterdir():
  print(item, item.is_dir(), item.is_file())

At this point, if we wanted to find ALL of the files and directories below a given starting point, we could imagine calling `iterdir()` again on every directory we come across, with one more nested loop for each level.

In [None]:
for item in books_dir.iterdir():
  if item.is_dir():
    for item_2 in item.iterdir():
      if item_2.is_dir():
        for item_3 in item_2.iterdir():
          print(item_3)

          # This is getting out of hand

If we really wanted to find all of the directories and subdirectories using `iterdir()`, we would probably use **recursion**, which is a topic we might explore in a later lesson.

In [None]:
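def process_dir(directory):  # 'dir' is a Python builtin, so we avoid that name
  for item in directory.iterdir():
    if item.is_dir():
      process_dir(item)  # The function calls itself for each subdirectory
    else:
      print(item)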
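
The `process_dir` function above is only defined, never called, and it prints instead of returning anything. As a sketch of the same recursive idea, here is a variant that returns a list, run against a temporary directory so it works anywhere. The name `collect_files` and the directory layout are made up for this example.

```python
import tempfile
from pathlib import Path

def collect_files(directory):
  """Return every file under `directory`, descending into subdirectories."""
  found = []
  for item in directory.iterdir():
    if item.is_dir():
      found.extend(collect_files(item))  # recurse into the subdirectory
    else:
      found.append(item)
  return found

with tempfile.TemporaryDirectory() as tmp:
  root = Path(tmp)
  (root / 'author' / 'book').mkdir(parents=True)
  (root / 'top.txt').touch()
  (root / 'author' / 'book' / 'deep.txt').touch()

  # Show paths relative to the starting directory, in a stable order
  relative_names = sorted(p.relative_to(root).as_posix() for p in collect_files(root))

print(relative_names)
```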

Luckily, `Path` objects have a method called `walk()` (added in Python 3.12) which lets us expand a directory and get back all of the subdirectories, files, their subdirectories and files, etc.

In [None]:
for root, dirs, files in books_dir.walk():
  print(f'''== In {root} ==
  Directories: {dirs}
  Files: {files}
  ''')
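
`Path.walk()` only exists in Python 3.12 and later. If you are on an older version, the `os.walk()` function from the standard library does much the same thing, except that it gives you the current directory as a string rather than a `Path`. A minimal sketch, again using a throwaway directory with invented file names:

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
  root = Path(tmp)
  (root / 'author').mkdir()
  (root / 'author' / 'book.epub').touch()
  (root / 'readme.txt').touch()

  visited = {}
  for dirpath, dirnames, filenames in os.walk(root):
    # dirpath is a string, so wrap it in Path to use pathlib methods on it
    relative = Path(dirpath).relative_to(root).as_posix()
    visited[relative] = (sorted(dirnames), sorted(filenames))

print(visited)
```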

Now let's try collecting all of our epubs into a new folder called 'epubs'. We will use the `copy` function from the `shutil` module to copy the files over, rather than moving them (we don't want to accidentally mess up or delete our original files).

In [None]:
import shutil

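books_dir = Path('books')

out_epub_dir = books_dir / 'epubs'
out_epub_dir.mkdir(exist_ok=True)  # Create the new directory

# The double star in the glob means "with any number of directories in between".
# We build the full list first so files copied into 'epubs' aren't picked up mid-loop.
for item in list(books_dir.glob('**/*.epub')):
  if item.parent == out_epub_dir:
    continue  # Skip copies left over from a previous run
  shutil.copy(item, out_epub_dir)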


# Exercise - only those with images

This works, but we get multiple epubs for each book. Can you modify the above script so that it only copies the epub versions that have images (the ones with 'images' in the filename)?
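
As a hint rather than a full solution, here is one possible way to test whether a file name contains a given word. The file names below are hypothetical stand-ins, and the copying itself is left for you to work out.

```python
from pathlib import Path

# Hypothetical file names, for illustration only
candidates = [Path('pg1342.epub'), Path('pg1342-images.epub'), Path('cover.jpg')]

# Keep only the epubs whose file name contains the word 'images'
with_images = [p for p in candidates if p.suffix == '.epub' and 'images' in p.name]

print(with_images)
```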

# Music

Our music folder is a much bigger mess. Here we have songs from multiple different artists, in 3 random folders, with no indication of which artist or album each song belongs to. Let's try to extract ID3 metadata from the mp3 files so we can create a hierarchy of Artist -> Album -> Songs, and give the songs names based on their track number and title. We will use the [eyeD3 library](http://eyed3.readthedocs.io/en/latest/) for this.

In [None]:
import eyed3
from pathlib import Path
import shutil

music_path = Path('music')
out_path = Path('library')
out_path.mkdir(exist_ok=True)

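def get_metadata(filepath):
  audiofile = eyed3.load(filepath)
  # eyed3.load() returns None for files it can't handle, and .tag is None when
  # the file has no ID3 tag. Either way, there is no metadata to return.
  if audiofile is None or audiofile.tag is None:
    return None
  return {
    'artist': audiofile.tag.artist,
    'album': audiofile.tag.album,
    'title': audiofile.tag.title,
    # track_num is a (track number, total tracks) pair; we only want the first
    'track_num': audiofile.tag.track_num[0] if audiofile.tag.track_num else '',
  }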

# Your code here

# Exercise - Sorting the music library

Given the function above that extracts the metadata, try writing a function that collects all of the mp3 files and puts them in the right place. So for the following:

```
{
    'artist': 'Foo Fighters',
    'album': 'There Is Nothing Left To Lose',
    'title': 'Learn To Fly',
    'track_num': 3,
}
```

This song should be copied from `songs_002/song_010.mp3` (or wherever it is) to `library/Foo Fighters/There Is Nothing Left To Lose/03 - Learn To Fly.mp3`.

You can add your code to the code block above.
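
To get you started, here is a rough sketch of just one piece of the puzzle: turning a metadata dictionary into a destination path, with the track number padded to two digits. The helper name `build_destination` is made up for this example. It assumes the artist, album and title are all present and contain no characters that are invalid in file names, which may not be true for every song. Finding the mp3 files, creating the directories and copying are left to you.

```python
from pathlib import Path

def build_destination(metadata, library_dir=Path('library')):
  """Build library/<artist>/<album>/<NN> - <title>.mp3 from a metadata dict."""
  track = metadata['track_num']
  # Only add the 'NN - ' prefix when we actually have a track number
  prefix = f'{track:02d} - ' if isinstance(track, int) else ''
  filename = f"{prefix}{metadata['title']}.mp3"
  return library_dir / metadata['artist'] / metadata['album'] / filename

example = {
  'artist': 'Foo Fighters',
  'album': 'There Is Nothing Left To Lose',
  'title': 'Learn To Fly',
  'track_num': 3,
}

destination = build_destination(example)
print(destination.as_posix())
```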