# Managing Files and Folders Using Python



### Professional Organization

Staying organized is paramount.

Here's a professional structure (create your own, but stay consistent):

```dataProjects --> yyyy-project-name```

Within each project folder, include:

- ```raw``` folder
- ```output``` folder
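The structure above can also be built with a few lines of Python. This is a minimal sketch, not a required workflow: the helper name `make_project` and the project name `2022-project-name` are made up for illustration, and the demo runs inside a temporary directory so nothing is left on your disk.

```python
from pathlib import Path
import tempfile

def make_project(base_dir, project_name):
    """Create <base_dir>/dataProjects/<project_name>/ with raw and output subfolders."""
    project = Path(base_dir) / "dataProjects" / project_name
    for sub in ("raw", "output"):
        # parents=True builds any missing parent folders;
        # exist_ok=True makes it harmless to run the cell twice
        (project / sub).mkdir(parents=True, exist_ok=True)
    return project

# Demo in a throwaway directory (hypothetical project name)
with tempfile.TemporaryDirectory() as tmp:
    project = make_project(tmp, "2022-project-name")
    created = sorted(p.name for p in project.iterdir())

print(created)  # ['output', 'raw']
```

To use it for real, pass the folder where you keep your projects instead of the temporary directory.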




#### As we create and tap files for analysis, we need to stay organized programmatically.

- Let's understand the folder/directory structure on our computers.

- We'll use the ```os``` module, ```pathlib``` and ```shutil``` to create, navigate and delete files and folders programmatically.

- We'll also use command line/UNIX commands like ```ls```, ```cd``` and ```mkdir```.

- [Download the sample files](https://drive.google.com/file/d/1lHEMs5mVhNKXsZHYwE_ww_AhWdFK8WPI/view?usp=share_link) we will need.

- For now, place the unzipped folder's content into your ```raw``` folder inside the newly created project folder.

In [2]:
## import libraries
import os  ## lets you navigate, create and delete folders
from pathlib import Path ## lets you build paths to files and folders
import shutil ## lets you delete a folder together with the files in it
import pandas as pd

## UNIX Command Line v. Programmatic Command Line

### NOTE: Do pure UNIX commands in your ```Terminal``` application.

- ```UNIX``` commands are typed manually, one at a time.

#### Programmatic Folder/Files Management

 
- We make this Python scriptable by using the ```os module```.

UNIX v. Python <a href="https://docs.google.com/spreadsheets/d/1J7CVJgrYWh6xQMe4LzBxn2_ZNrSedo4S6vIN8Ct2EtE/edit?usp=sharing">commands</a>





## Where am I?

#### UNIX - ```pwd```
#### os - ```os.getcwd() ```

In [3]:
## os
os.getcwd()

'/Users/sandeepjunnarkar/Dropbox/Mac/Downloads/dataProjects/2022-os-file-demo'
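Since we also imported `Path`, it may help to see the `pathlib` counterpart of `os.getcwd()`. The sketch below assumes nothing beyond the standard library; the variable names are illustrative only.

```python
import os
from pathlib import Path

cwd_as_string = os.getcwd()  # a plain string
cwd_as_path = Path.cwd()     # a Path object pointing at the same place

# A Path object can be taken apart without string slicing
print(cwd_as_path.name)    # the last component, i.e. the current folder's name
print(cwd_as_path.parent)  # the enclosing folder
```

Both calls report the same location; the `Path` version is handier when you want to build other paths from it.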

## List content of directory

#### UNIX - ```ls```
#### os - ```os.listdir()``` 

In [5]:
ls

LICENSE
README.md
output/
raw-data/
wk-08A-filefolder-management-DEMO.ipynb


In [4]:
os.listdir()

['raw-data',
 '.DS_Store',
 'LICENSE',
 'output',
 'README.md',
 'wk-08A-filefolder-management-DEMO.ipynb',
 '.gitignore',
 '.ipynb_checkpoints',
 '.git']
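Notice that `os.listdir()` also returns hidden entries such as `.DS_Store` and `.git`, which plain `ls` skips. One possible way to hide them is sketched below; `visible_entries` is a hypothetical helper name, and the demo builds its own small temporary folder rather than touching the project.

```python
import os
import tempfile

def visible_entries(folder="."):
    """Return the names in folder that do not start with a dot, sorted."""
    return sorted(name for name in os.listdir(folder) if not name.startswith("."))

# Demo in a throwaway directory with a mix of hidden and visible entries
with tempfile.TemporaryDirectory() as tmp:
    for filename in ("README.md", ".DS_Store", ".gitignore"):
        open(os.path.join(tmp, filename), "w").close()  # create empty files
    os.mkdir(os.path.join(tmp, "output"))
    listing = visible_entries(tmp)

print(listing)  # ['README.md', 'output']
```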

## Change directories

#### UNIX -  ```cd```
#### os - ```os.chdir("name-of-directory")```

Let's enter the ```raw-data``` folder in our project folder.

In [7]:
## change directory
os.chdir("raw-data")

In [8]:
## where am I now?
os.getcwd()

'/Users/sandeepjunnarkar/Dropbox/Mac/Downloads/dataProjects/2022-os-file-demo/raw-data'

## What does this folder hold?

In [9]:
## list content of this directory
os.listdir()

['fla_count_5.csv',
 'fla_count_4.csv',
 'fla_count_3csv.csv',
 'fla_count_2.csv',
 'fla_count_1.csv',
 'adolph-coors-2015.pdf',
 'adolph-coors-2014.pdf',
 'adolph-coors-2013.pdf',
 'read_sample1.txt']
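With a mixed listing like this one, a quick tally by file type can be useful. The sketch below works on the list of names shown above, so it runs without the sample files; `group_by_extension` is a hypothetical helper, not part of `os` or `pathlib`.

```python
from collections import defaultdict
from pathlib import Path

def group_by_extension(names):
    """Group file names by their extension (for example '.csv')."""
    groups = defaultdict(list)
    for name in names:
        groups[Path(name).suffix].append(name)
    return dict(groups)

# The names returned by os.listdir() in the cell above
names = ['fla_count_5.csv', 'fla_count_4.csv', 'fla_count_3csv.csv',
         'fla_count_2.csv', 'fla_count_1.csv', 'adolph-coors-2015.pdf',
         'adolph-coors-2014.pdf', 'adolph-coors-2013.pdf', 'read_sample1.txt']

groups = group_by_extension(names)
counts = {ext: len(files) for ext, files in groups.items()}
print(counts)  # {'.csv': 5, '.pdf': 3, '.txt': 1}
```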

## Back out of folder to the enclosing folder

#### UNIX - ```cd ..```
#### os - ```os.chdir("..")```

In [10]:
## back out
os.chdir("..")

In [11]:
### - Where am I?
os.getcwd() 

'/Users/sandeepjunnarkar/Dropbox/Mac/Downloads/dataProjects/2022-os-file-demo'

In [13]:
## let's return to the raw-data folder in our project folder
os.chdir("raw-data")

In [14]:
### - Let's confirm where we are:
os.getcwd() 


'/Users/sandeepjunnarkar/Dropbox/Mac/Downloads/dataProjects/2022-os-file-demo/raw-data'

In [15]:
### - list the content of this folder again
os.listdir()

['fla_count_5.csv',
 'fla_count_4.csv',
 'fla_count_3csv.csv',
 'fla_count_2.csv',
 'fla_count_1.csv',
 'adolph-coors-2015.pdf',
 'adolph-coors-2014.pdf',
 'adolph-coors-2013.pdf',
 'read_sample1.txt']

In [16]:
os.chdir("..")
os.getcwd() 

'/Users/sandeepjunnarkar/Dropbox/Mac/Downloads/dataProjects/2022-os-file-demo'
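Moving in and out of folders by hand makes it easy to lose track of where you are. One common pattern, sketched here under the assumption that you only need to be in a folder briefly, is a small context manager that always returns you to the starting folder. The name `working_directory` is made up for this example.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def working_directory(path):
    """Temporarily change into path, then return to the previous folder."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # runs even if the code inside the with-block raises an error
        os.chdir(previous)

start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with working_directory(tmp):
        # inside the block we are in the temporary folder
        inside_matches = os.path.samefile(os.getcwd(), tmp)

back_home = os.getcwd() == start
print(inside_matches, back_home)  # True True
```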

## Generating folders
#### UNIX - ```mkdir folder-name```
#### pathlib - ```Path("folder-name").mkdir(exist_ok=True)```

In [17]:
## create a folder here without storing its path in a variable
Path("dummy-folder").mkdir(exist_ok = True)

In [18]:
os.listdir()

['raw-data',
 '.DS_Store',
 'LICENSE',
 'output',
 'README.md',
 'wk-08A-filefolder-management-DEMO.ipynb',
 '.gitignore',
 '.ipynb_checkpoints',
 '.git',
 'dummy-folder']

In [19]:
## create junk-folder

Path("junk-folder").mkdir(exist_ok = True)

In [20]:
os.listdir()

['raw-data',
 '.DS_Store',
 'LICENSE',
 'output',
 'junk-folder',
 'README.md',
 'wk-08A-filefolder-management-DEMO.ipynb',
 '.gitignore',
 '.ipynb_checkpoints',
 '.git',
 'dummy-folder']

### Always better to use variables so we can generate folders programmatically

In [22]:
## create a path to a folder called scraps
## we store that path in a variable called new_dir
new_dir = Path("scraps")

In [23]:
## create that directory
## exist_ok=True means no error is raised if the folder already exists
new_dir.mkdir(exist_ok = True)

### List contents of folder now

In [25]:
## show list programmatically
os.listdir()

['raw-data',
 '.DS_Store',
 'LICENSE',
 'output',
 'junk-folder',
 'README.md',
 'wk-08A-filefolder-management-DEMO.ipynb',
 '.gitignore',
 '.ipynb_checkpoints',
 '.git',
 'dummy-folder',
 'scraps']
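Storing paths in variables pays off once you need many folders at once. The loop below is a small sketch of that idea: the `reports-<year>` folder names are invented for illustration, and everything happens inside a temporary directory.

```python
from pathlib import Path
import tempfile

years = [2013, 2014, 2015]  # hypothetical values to build folder names from

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / "scraps"
    for year in years:
        # parents=True also creates "scraps" itself the first time through
        (base / f"reports-{year}").mkdir(parents=True, exist_ok=True)
    made = sorted(p.name for p in base.iterdir())

print(made)  # ['reports-2013', 'reports-2014', 'reports-2015']
```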

## Manually add some junk files to junk-folder, then move into that folder

Run the next step only after you have done this.

In [26]:
os.getcwd()

'/Users/sandeepjunnarkar/Dropbox/Mac/Downloads/dataProjects/2022-os-file-demo'

In [27]:
## move into junk folder
os.chdir("junk-folder")
os.getcwd()

'/Users/sandeepjunnarkar/Dropbox/Mac/Downloads/dataProjects/2022-os-file-demo/junk-folder'

In [28]:
## list content
os.listdir()

['fla_count_2.csv',
 'fla_count_1.csv',
 'adolph-coors-2015.pdf',
 'adolph-coors-2014.pdf',
 'adolph-coors-2013.pdf']

In [33]:
os.chdir("../../..")

In [34]:
os.getcwd()

'/Users/sandeepjunnarkar/Dropbox/Mac/Downloads'

In [35]:
os.chdir("dataProjects/2022-os-file-demo/junk-folder")
os.getcwd()

'/Users/sandeepjunnarkar/Dropbox/Mac/Downloads/dataProjects/2022-os-file-demo/junk-folder'

## Delete a file

#### UNIX - ```rm filename```
#### os - ```os.remove("filename")```

In [29]:
## remove fla_count_2.csv
os.remove("fla_count_2.csv")

In [30]:
## is it still there?
os.listdir()

['fla_count_1.csv',
 'adolph-coors-2015.pdf',
 'adolph-coors-2014.pdf',
 'adolph-coors-2013.pdf']
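`pathlib` offers an equivalent to `os.remove()` called `Path.unlink()`. The sketch below assumes Python 3.8 or later, which is when the `missing_ok` argument was added, and uses a placeholder file in a temporary directory rather than the real sample data.

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "fla_count_2.csv"
    target.write_text("placeholder\n")  # stand-in for the real file
    existed_before = target.exists()

    target.unlink()                 # same effect as os.remove(target)
    exists_after = target.exists()

    # Deleting again would normally raise FileNotFoundError;
    # missing_ok=True turns that into a no-op
    target.unlink(missing_ok=True)

print(existed_before, exists_after)  # True False
```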

## Delete a folder

1. You can't delete a folder with stuff in it using ```os.rmdir()```; that takes a trick, ```shutil.rmtree()```!
2. You can't be in the folder you are trying to delete!

In [38]:
os.chdir("..")
os.getcwd()

'/Users/sandeepjunnarkar/Dropbox/Mac/Downloads/dataProjects/2022-os-file-demo'

In [39]:
os.listdir()

['raw-data',
 '.DS_Store',
 'LICENSE',
 'output',
 'junk-folder',
 'README.md',
 'wk-08A-filefolder-management-DEMO.ipynb',
 '.gitignore',
 '.ipynb_checkpoints',
 '.git',
 'dummy-folder',
 'scraps']

In [40]:
## remove an empty directory
## NOTE: This only removes empty directories
os.rmdir("dummy-folder")

In [41]:
## what does the folder hold now?
os.listdir()

['raw-data',
 '.DS_Store',
 'LICENSE',
 'output',
 'junk-folder',
 'README.md',
 'wk-08A-filefolder-management-DEMO.ipynb',
 '.gitignore',
 '.ipynb_checkpoints',
 '.git',
 'scraps']

In [None]:
## change directory


In [None]:
## show directory now programmatically


In [42]:
## try to delete junk-folder with os.rmdir (fails because it is not empty)
os.rmdir("junk-folder")

OSError: [Errno 66] Directory not empty: 'junk-folder'

In [None]:
## show directory now programmatically


In [None]:
## move program into junk folder programmatically



In [None]:
### list directory


In [None]:
## confirm location


## Delete junk-folder (this will break)

In [43]:
## delete junk folder - will break
os.rmdir("junk-folder")

OSError: [Errno 66] Directory not empty: 'junk-folder'

In [None]:
## where am i?


In [None]:
## move out of junk folder


In [None]:
## show directory now USING OS


In [44]:
## now delete the folder and everything inside it
shutil.rmtree("junk-folder")

In [45]:
## show directory now USING OS
os.listdir()

['raw-data',
 '.DS_Store',
 'LICENSE',
 'output',
 'README.md',
 'wk-08A-filefolder-management-DEMO.ipynb',
 '.gitignore',
 '.ipynb_checkpoints',
 '.git',
 'scraps']
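The delete steps above can be condensed into one self-contained sketch: try `os.rmdir()`, catch the error a non-empty folder raises, then fall back to `shutil.rmtree()`. The folder and file here are placeholders created in a temporary directory. Keep in mind that `shutil.rmtree()` deletes permanently, without a trash can, so double-check the path before running it on real data.

```python
import os
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    junk = Path(tmp) / "junk-folder"
    junk.mkdir()
    (junk / "fla_count_1.csv").write_text("placeholder\n")  # make it non-empty

    try:
        os.rmdir(junk)  # only works on empty folders
        rmdir_failed = False
    except OSError as err:
        rmdir_failed = True
        print("os.rmdir refused:", err.strerror)

    shutil.rmtree(junk)  # removes the folder and everything inside it
    gone = not junk.exists()

print(rmdir_failed, gone)  # True True
```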