# Introduction
This material focuses on how Python interacts with directory structures. These exercises use the built-in `os` module to inspect files and folders and perform some basic operations. They should help cement your understanding of the differences between files and folders, and of directory traversal.

```{warning} 
The code block below creates files and folders in your current working directory. It also overwrites any existing files with the same names. Please read it carefully before running it!
```

In [1]:
# Read through this block before executing the code and then execute it to create the sample file structure for practice below.

import os
structure = {
    "project_root": {
        "docs": {
            "notes.txt": "Python is fun",
            "todo.txt": "Complete project 1",
            "old": {
                "archive.docx": "Nothing here",
            },
        },
        "src": {
            "main.py": "print('hello')",
            "helper.py": "print('I am helping')",
            "utils": {
                "calc.py": "pass",
                "strings.py": "pass",
            },
        },
        "data": {
            "input.csv": "no data",
            "output.csv": "no data",
            "images": {
                "logo.png": "binary stuff",
                "chart.jpg": "binary stuff again",
                "old": {
                    "draft.png": "even more binary stuff",
                },
            },
        },
        "README.md": "Use the jupyter notebook to solve some practice problems",
    }
}


def create_structure(base_path, structure):
    for name, content in structure.items():
        path = os.path.join(base_path, name)
        if isinstance(content, dict):
            os.makedirs(path, exist_ok=True)
            create_structure(path, content)
        else:
            with open(path, "w") as f:
                f.write(content)


create_structure(".", structure)
print("Sample directory structure created!")

# os.chdir('project_root')

Sample directory structure created!


## Sample Structure

If you run the code snippet above, your sample directory structure will look like this.  Solve the problems below using this directory structure as a reference.

```
project_root/
├── docs/
│   ├── notes.txt
│   ├── todo.txt
│   └── old/
│       └── archive.docx
├── src/
│   ├── main.py
│   ├── helper.py
│   └── utils/
│       ├── calc.py
│       └── strings.py
├── data/
│   ├── input.csv
│   ├── output.csv
│   └── images/
│       ├── logo.png
│       ├── chart.jpg
│       └── old/
│           └── draft.png
└── README.md
```

# Problem 0
Use the `os.listdir()` method to see which folders and files are available in the current directory.

What do you notice about the output?  Is there any difference between a file and a folder?

Hint: You may need to use `os.chdir()` to get to the correct directory. You can use `os.getcwd()` to see which directory your interpreter is currently running in.

['data', 'docs', 'README.md', 'src']
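
As a minimal sketch of the calls named in the hint, the snippet below works in a throwaway temporary directory rather than in `project_root`. The names `folder_a` and `file_a.txt` are made up for illustration. Note that `os.listdir()` returns plain strings for both files and folders, and its order is not guaranteed, so the result is sorted here.

```python
import os
import tempfile

# Build a throwaway directory so the example does not touch project_root.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "folder_a"))           # hypothetical folder name
    with open(os.path.join(tmp, "file_a.txt"), "w") as f:  # hypothetical file name
        f.write("hello")

    original_dir = os.getcwd()   # remember where the interpreter started
    os.chdir(tmp)                # move into the temporary directory
    try:
        # listdir returns bare names; nothing marks an entry as a file or a folder
        entries = sorted(os.listdir("."))
        print(entries)
    finally:
        os.chdir(original_dir)   # always move back before the directory is deleted
```

Restoring the original directory in a `finally` block is one way to avoid leaving the interpreter in a folder that no longer exists.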

# Problem 1
Write a function to list only subdirectories for the given directory.  

Hint:  Use the `os.path.isdir()` method along with `os.listdir()` to only print out the directories.

Bonus:  Write another function that only gives you files.
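
One detail that is easy to miss: `os.listdir()` returns bare names, so they usually need to be joined with the parent path before being passed to `os.path.isdir()` or `os.path.isfile()`. The sketch below shows this on a temporary directory. The helper name `classify_entries` is hypothetical and is not the function the problem asks for.

```python
import os
import tempfile


def classify_entries(path):
    """Map each name found in `path` to 'dir', 'file', or 'other'."""
    kinds = {}
    for name in os.listdir(path):
        full_path = os.path.join(path, name)  # listdir gives names, not full paths
        if os.path.isdir(full_path):
            kinds[name] = "dir"
        elif os.path.isfile(full_path):
            kinds[name] = "file"
        else:
            kinds[name] = "other"
    return kinds


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "images"))
    with open(os.path.join(tmp, "input.csv"), "w") as f:
        f.write("no data")

    kinds = classify_entries(tmp)
    print(kinds)
```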


In [None]:
def list_subdirectories(path):
    pass

# assert list_subdirectories('data') == ['images']
# assert sorted(list_subdirectories('.')) == ['data', 'docs', 'src']

print(list_subdirectories('.'))

['data', 'docs', 'src']


# Problem 2

Use your function from above to get all the subdirectories for a given path, including nested ones. Each subdirectory should include its parent path.

Hint: You will have to get a little creative about how you handle navigating the subfolders. I would suggest using a `while` loop and `append`ing and `pop`ping items off of a list. (If you're a recursion fan, you can do that too.)

Your output for the current directory should look something like this (you might need to replace `project_root` with `.`):

```
project_root/
project_root/docs/
project_root/docs/old/
project_root/src/
project_root/src/utils/
project_root/data/
project_root/data/images/
project_root/data/images/old/
```
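
To illustrate the work-list idea from the hint without touching the file system, here is a rough sketch that walks a small nested dictionary shaped like the `structure` dictionary above (dictionaries stand for folders, strings for files). The `tree` data and the `folder_paths` name are assumptions made for this example. Each queued item carries its full path, which is how the parent path ends up in the output.

```python
# Hypothetical nested dict: dict values are folders, string values are files.
tree = {"root": {"docs": {"old": {}, "notes.txt": "x"}, "src": {}}}


def folder_paths(tree):
    """Return every folder path in `tree`, parents before their children."""
    result = []
    # Each work item is a (full_path, children) pair.
    to_process = [(name, content) for name, content in tree.items()
                  if isinstance(content, dict)]
    while to_process:
        path, children = to_process.pop(0)   # take the oldest item first
        result.append(path + "/")
        for name, content in children.items():
            if isinstance(content, dict):    # only folders are queued
                to_process.append((path + "/" + name, content))
    return result


paths = folder_paths(tree)
print(paths)
```

Because items are popped from the front of the list, folders come out level by level. Popping from the end instead would give a depth-first order.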

In [None]:
def sample_loop_processing():
    # this is just an idea to get you started

    # our final result
    result = []

    # a list of things that we need to process
    to_process = [5]

    # while there are elements left to process...
    while to_process:
        # add this value to our list of results
        next_val = to_process.pop(0)
        result.append(next_val)

        # find any "sub values" and add them to our list to process later
        to_process.extend(range(next_val - 1, 0, -1))
        
        print(f"Added {next_val} to result, remaining left to process is {to_process}")

    return result
final = sample_loop_processing()
print(f"Final result is {final}")

# def list_all_subdirectories(path):
#     pass

# for subdir in list_all_subdirectories('.'):
#     print(subdir)

Added 5 to result, remaining left to process is [4, 3, 2, 1]
Added 4 to result, remaining left to process is [3, 2, 1, 3, 2, 1]
Added 3 to result, remaining left to process is [2, 1, 3, 2, 1, 2, 1]
Added 2 to result, remaining left to process is [1, 3, 2, 1, 2, 1, 1]
Added 1 to result, remaining left to process is [3, 2, 1, 2, 1, 1]
Added 3 to result, remaining left to process is [2, 1, 2, 1, 1, 2, 1]
Added 2 to result, remaining left to process is [1, 2, 1, 1, 2, 1, 1]
Added 1 to result, remaining left to process is [2, 1, 1, 2, 1, 1]
Added 2 to result, remaining left to process is [1, 1, 2, 1, 1, 1]
Added 1 to result, remaining left to process is [1, 2, 1, 1, 1]
Added 1 to result, remaining left to process is [2, 1, 1, 1]
Added 2 to result, remaining left to process is [1, 1, 1, 1]
Added 1 to result, remaining left to process is [1, 1, 1]
Added 1 to result, remaining left to process is [1, 1]
Added 1 to result, remaining left to process is [1]
Added 1 to result, remaining left to process is []
Final result is [5, 4, 3, 2, 1, 3, 2, 1, 2, 1, 1, 2, 1, 1, 1, 1]

# Problem 3
Notice that for Problem 2, you need to keep track of where you are and where the subdirectories are as well. Thankfully, someone has already solved that problem for us. The `os.walk()` function returns a *generator* that yields a tuple of three values for each directory it visits:
 
1. your current directory
2. the current directory's subdirectories
3. and the current directory's subfiles.
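
To see the three values in practice, here is a small sketch that unpacks each tuple produced by `os.walk()` on a temporary directory. The folder and file names are invented for the example, and the paths are converted to a relative form with `/` separators only so that the result looks the same on every operating system.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: tmp/data/images/ plus one file in tmp/data/
    os.makedirs(os.path.join(tmp, "data", "images"))
    with open(os.path.join(tmp, "data", "input.csv"), "w") as f:
        f.write("no data")

    summary = {}
    for current, subdirs, files in os.walk(tmp):
        # Express the current directory relative to the starting point.
        rel = os.path.relpath(current, tmp).replace(os.sep, "/")
        summary[rel] = (sorted(subdirs), sorted(files))

    print(summary)
```

The second and third values hold names only, not full paths, so they would need `os.path.join(current, name)` before being used with other `os.path` functions.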

Use `os.walk()` to solve Problem 2 again.

Hint: This is easier than it might sound. You only need a single `for` loop, and you don't have to use the subdirectories or the files at all. The sample output below was produced on Windows; on macOS or Linux the path separator is `/` instead of `\`.

```
.
.\data
.\data\images
.\data\images\old
.\docs
.\docs\old
.\src
.\src\utils
```

In [None]:
for directory, _, _ in os.walk('.'):
    pass

.
.\data
.\data\images
.\data\images\old
.\docs
.\docs\old
.\src
.\src\utils


# Problem 4
Modify your solution to Problem 3 so that each directory's immediate subdirectories are also listed below it, indented one level.

Notice that in the output below, the `images` folder shows up twice: once as a subfolder under `.\data` and once again as its own folder, `.\data\images`.

Hint:  In a string, `\t` represents the tab character.

```
.
	data
	docs
	src
.\data
	images
.\data\images
	old
.\data\images\old
.\docs
	old
.\docs\old
.\src
	utils
.\src\utils
```
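
As a quick illustration of the `\t` hint, the sketch below indents child names under a parent using made-up in-memory data rather than real directories. The `listing` dictionary is purely hypothetical.

```python
# Hypothetical data: one parent name mapped to its child names.
listing = {"parent": ["child_a", "child_b"]}

lines = []
for parent, children in listing.items():
    lines.append(parent)              # parent at the left margin
    for child in children:
        lines.append("\t" + child)    # one tab character = one indent level

text = "\n".join(lines)
print(text)
```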

In [None]:
for directory, subdirectories, _ in os.walk('.'):
    pass

.
	data
	docs
	src
.\data
	images
.\data\images
	old
.\data\images\old
.\docs
	old
.\docs\old
.\src
	utils
.\src\utils


# Problem 5
Modify your solution above to also print each directory's files, indented by two levels. Include each file's size in bytes by using the `os.path.getsize()` function.
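
For reference, `os.path.getsize()` takes a path and returns the size in bytes, so a bare file name from `os.walk()` typically has to be joined with its directory first. The sketch below writes a file with known content to a temporary directory and formats one output line. The `(N bytes)` layout is just one possible format, not a required one.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("Python is fun")   # 13 ASCII characters, so 13 bytes in UTF-8

    size = os.path.getsize(path)   # needs the full path, not just "notes.txt"
    line = f"\t\t{os.path.basename(path)} ({size} bytes)"  # two tabs = two levels
    print(line)
```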

In [None]:
for directory, subdirectories, subfiles in os.walk('.'):
    pass